Genes for the synthesis of antipathogenic substances

ABSTRACT

The present invention is directed to the production of an antipathogenic substance (APS) in a host via recombinant expression of the polypeptides needed to biologically synthesize the APS. Genes encoding polypeptides necessary to produce particular antipathogenic substances are provided, along with methods for identifying and isolating genes needed to recombinantly biosynthesize any desired APS. The cloned genes may be transformed and expressed in a desired host organism to produce the APS according to the invention for a variety of purposes, including protecting the host from a pathogen, developing the host as a biocontrol agent, and producing large, uniform amounts of the APS.

This application is a continuation-in-part of Ser. No. 08/087,636, filed 1 July 1993, now abandoned, which is itself a continuation-in-part of Ser. No. 07/908,284, filed 2 Jul. 1992, now abandoned, which is itself a continuation-in-part of Ser. No. 07/570,184, filed 20 Aug. 1990 (now abandoned). This application is also a continuation-in-part of international PCT application No. US93/07954 filed on 24 Aug. 1993 (WO 94/05793), which is itself a continuation-in-part of Ser. No. 07/937,648, filed 31 Aug. 1992 (now abandoned).

FIELD OF THE INVENTION

The present invention relates generally to the protection of host organisms against pathogens, and more particularly to the protection of plants against phytopathogens. In one aspect it provides transgenic plants which have enhanced resistance to phytopathogens and biocontrol organisms with enhanced biocontrol properties. It further provides methods for protecting plants against phytopathogens and methods for the production of antipathogenic substances.

BACKGROUND OF THE INVENTION

Plants routinely become infected by fungi and bacteria, and many microbial species have evolved to utilize the different niches provided by the growing plant. Some phytopathogens have evolved to infect foliar surfaces and are spread through the air, from plant-to-plant contact or by various vectors, whereas other phytopathogens are soil-borne and preferentially infect roots and newly germinated seedlings. In addition to infection by fungi and bacteria, many plant diseases are caused by nematodes which are soil-borne and infect roots, typically causing serious damage when the same crop species is cultivated for successive years on the same area of ground.

Plant diseases cause considerable crop loss from year to year, resulting both in economic hardship to farmers and nutritional deprivation for local populations in many parts of the world. The widespread use of fungicides has provided considerable security against phytopathogen attack, but despite $1 billion worth of expenditure on fungicides, worldwide crop losses amounted to approximately 10% of crop value in 1981 (James, Seed Sci. & Technol. 9: 679-685 (1981)). The severity of the destructive process of disease depends on the aggressiveness of the phytopathogen and the response of the host, and one aim of most plant breeding programs is to increase the resistance of host plants to disease. Novel gene sources and combinations developed for resistance to disease have typically only had a limited period of successful use in many crop-pathogen systems due to the rapid evolution of phytopathogens to overcome resistance genes. In addition, there are several documented cases of the evolution of fungal strains which are resistant to particular fungicides. As early as 1981, Fletcher and Wolfe (Proc. 1981 Brit. Crop Prot. Conf. (1981)) contended that 24% of the powdery mildew populations from spring barley, and 53% from winter barley, showed considerable variation in response to the fungicide triadimenol and that the distribution of these populations varied between barley varieties, with the most susceptible variety also giving the highest incidence of less susceptible fungal types. Similar variation in the sensitivity of fungi to fungicides has been documented for wheat mildew (also to triadimenol), Botrytis (to benomyl), Pyrenophora (to organomercury), Pseudocercosporella (to MBC-type fungicides) and Mycosphaerella fijiensis (to triazoles), to mention just a few (Jones and Clifford; Cereal Diseases, John Wiley, 1983). Diseases caused by nematodes have also been controlled successfully by pesticide application. Whereas most fungicides are relatively harmless to mammals and the problems with their use lie in the development of resistance in target fungi, the major problem associated with the use of nematicides is their relatively high toxicity to mammals. Most nematicides used to control soil nematodes are of the carbamate, organochlorine or organophosphorous groups and must be applied to the soil with particular care.

In some crop species, the use of biocontrol organisms has been developed as a further alternative to protect crops. Biocontrol organisms have the advantage of being able to colonize and protect parts of the plant inaccessible to conventional fungicides. This practice developed from the recognition that crops grown in some soils are naturally resistant to certain fungal phytopathogens and that the suppressive nature of these soils is lost by autoclaving. Furthermore, it was recognized that soils which are conducive to the development of certain diseases could be rendered suppressive by the addition of small quantities of soil from a suppressive field (Scher et al. Phytopathology 70: 412-417 (1980)). Subsequent research demonstrated that root colonizing bacteria were responsible for this phenomenon, now known as biological disease control (Baker et al. Biological Control of Plant Pathogens, Freeman Press, San Francisco, 1974). In many cases, the most efficient strains of biological disease controlling bacteria are of the species Pseudomonas fluorescens (Weller et al. Phytopathology 73: 463-469 (1983); Kloepper et al. Phytopathology 71: 1020-1024 (1981)). Important plant pathogens that have been effectively controlled by seed inoculation with these bacteria include Gaeumannomyces graminis, the causative agent of take-all in wheat (Cook et al. Soil Biol. Biochem 8: 269-273 (1976)) and the Pythium and Rhizoctonia phytopathogens involved in damping off of cotton (Howell et al. Phytopathology 69: 480-482 (1979)). Several biological disease controlling Pseudomonas strains produce antibiotics which inhibit the growth of fungal phytopathogens (Howell et al. Phytopathology 69: 480-482 (1979); Howell et al. Phytopathology 70: 712-715 (1980)) and these have been implicated in the control of fungal phytopathogens in the rhizosphere. Although biocontrol was initially believed to have considerable promise as a method of widespread application for disease control, it has found application mainly in the environment of glasshouse crops where its utility in controlling soil-borne phytopathogens is best suited for success. Large scale field application of naturally occurring microorganisms has not proven possible due to constraints of microorganism production (they are often slow growing), distribution (they are often short lived) and cost (the result of both these problems). In addition, the success of biocontrol approaches is also largely limited by the identification of naturally occurring strains which may have a limited spectrum of efficacy. Some initial approaches have also been taken to control nematode phytopathogens using biocontrol organisms. Although these approaches are still exploratory, some Streptomyces species have been reported to control the root knot nematode (Meloidogyne spp.) (WO 93/18135 to Research Corporation Technologies), and toxins from some Bacillus thuringiensis strains (such as israelensis) have been shown to have broad anti-nematode activity, and spore or bacillus preparations may thus provide suitable biocontrol opportunities (EP 0 352 052 to Mycogen, WO 93/19604 to Research Corporation Technologies).

The traditional methods of protecting crops against disease, including plant breeding for disease resistance, the continued development of fungicides, and more recently, the identification of biocontrol organisms, have all met with success. It is apparent, however, that scientists must constantly be in search of new methods with which to protect crops against disease. This invention provides novel methods for the protection of plants against phytopathogens.

SUMMARY OF THE INVENTION

The present invention reveals the genetic basis for substances produced by particular microorganisms via a multi-gene biosynthetic pathway which have a deleterious effect on the multiplication or growth of plant pathogens. These substances include carbohydrate containing antibiotics such as aminoglycosides, peptide antibiotics, nucleoside derivatives and other heterocyclic antibiotics containing nitrogen and/or oxygen, polyketides, macrocyclic lactones, and quinones.

The invention provides the entire set of genes required for recombinant production of particular antipathogenic substances in a host organism. It further provides methods for the manipulation of APS gene sequences for their expression in transgenic plants. The transgenic plants thus modified have enhanced resistance to attack by phytopathogens. The invention provides methods for the cellular targeting of APS gene products so as to ensure that the gene products have appropriate spatial localization for the availability of the required substrate/s. Further provided are methods for the enhancement of throughput through the APS metabolic pathway by overexpression and overproduction of genes encoding substrate precursors.

The invention further provides a novel method for the identification and isolation of the genes involved in the biosynthesis of any particular APS in a host organism.

The invention also describes improved biocontrol strains which produce heterologous APSs and which are efficacious in controlling soil-borne and seedling phytopathogens outside the usual range of the host.

Thus, the invention provides methods for disease control. These methods involve the use of transgenic plants expressing APS biosynthetic genes and the use of biocontrol agents expressing APS genes.

The invention further provides methods for the production of APSs in quantities large enough to enable their isolation and use in agricultural formulations. A specific advantage of these production methods is the chirality of the molecules produced; production in transgenic organisms avoids the generation of populations of racemic mixtures, within which some enantiomers may have reduced activity.

DEFINITIONS

As used in the present application, the following terms have the meanings set out below.

Antipathogenic Substance: A substance which requires one or more nonendogenous enzymatic activities foreign to a plant to be produced in a host where it does not naturally occur, which substance has a deleterious effect on the multiplication or growth of a pathogen. By "nonendogenous enzymatic activities" is meant enzymatic activities that do not naturally occur in the host where the antipathogenic substance does not naturally occur. A pathogen may be a fungus, bacterium, nematode, virus, viroid, insect or combination thereof, and may be the direct or indirect causal agent of disease in the host organism. An antipathogenic substance can prevent the multiplication or growth of a phytopathogen or can kill a phytopathogen. An antipathogenic substance may be synthesized from a substrate which naturally occurs in the host. Alternatively, an antipathogenic substance may be synthesized from a substrate that is provided to the host along with the necessary nonendogenous enzymatic activities. An antipathogenic substance may be a carbohydrate containing antibiotic, a peptide antibiotic, a heterocyclic antibiotic containing nitrogen, a heterocyclic antibiotic containing oxygen, a heterocyclic antibiotic containing nitrogen and oxygen, a polyketide, a macrocyclic lactone, or a quinone. Antipathogenic substance is abbreviated as "APS" throughout the text of this application.

Anti-phytopathogenic substance: An antipathogenic substance as herein defined which has a deleterious effect on the multiplication or growth of a plant pathogen (i.e. phytopathogen).

Biocontrol agent: An organism which is capable of affecting the growth of a pathogen such that the ability of the pathogen to cause a disease is reduced. Biocontrol agents for plants include microorganisms which are capable of colonizing plants or the rhizosphere. Such biocontrol agents include gram-negative microorganisms such as Pseudomonas, Enterobacter and Serratia, the gram-positive microorganism Bacillus and the fungi Trichoderma and Gliocladium. Organisms may act as biocontrol agents in their native state or when they are genetically engineered according to the invention.

Pathogen: Any organism which causes a deleterious effect on a selected host under appropriate conditions. Within the scope of this invention the term pathogen is intended to include fungi, bacteria, nematodes, viruses, viroids and insects.

Promoter or Regulatory DNA Sequence: An untranslated DNA sequence which assists in, enhances, or otherwise affects the transcription, translation or expression of an associated structural DNA sequence which codes for a protein or other DNA product. The promoter DNA sequence is usually located at the 5' end of a translated DNA sequence, typically between 20 and 100 nucleotides from the 5' end of the translation start site.

Coding DNA Sequence: A DNA sequence that is translated in an organism to produce a protein.

Operably Linked to/Associated With: Two DNA sequences which are "associated" or "operably linked" are related physically or functionally. For example, a promoter or regulatory DNA sequence is said to be "associated with" a DNA sequence that codes for an RNA or a protein if the two sequences are operably linked, or situated such that the regulator DNA sequence will affect the expression level of the coding or structural DNA sequence.

Chimeric Construction/Fusion DNA Sequence: A recombinant DNA sequence in which a promoter or regulatory DNA sequence is operably linked to, or associated with, a DNA sequence that codes for an mRNA or which is expressed as a protein, such that the regulator DNA sequence is able to regulate transcription or expression of the associated DNA sequence. The regulator DNA sequence of the chimeric construction is not normally operably linked to the associated DNA sequence as found in nature. The terms "heterologous" or "non-cognate" are used to indicate a recombinant DNA sequence in which the promoter or regulator DNA sequence and the associated DNA sequence are isolated from organisms of different species or genera.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1: Restriction map of the cosmid clone pCIB169 from Pseudomonas fluorescens carrying the pyrrolnitrin biosynthetic gene region.

FIG. 2: Insertion points of 30 independent Tn5 insertions along the length of pCIB169 for the identification of the genes for pyrrolnitrin biosynthesis.

FIG. 3: Restriction map of a 9.7 kb fragment of pCIB169 involved in pyrrolnitrin biosynthesis.

FIG. 4: Location of various subclones derived from pCIB169 isolated for sequence determination purposes.

FIG. 5: Localization of the four open reading frames (ORFs 1-4) responsible for pyrrolnitrin biosynthesis in strain MOCG134 on the ~6 kb XbaI/NotI fragment of pCIB169.

FIG. 6: Location of the sites of disruption of ORFs 1-4 in the pyrrolnitrin gene cluster of MOCG134.

FIG. 7: Restriction map of the cosmid clone p98/1 from Sorangium cellulosum carrying the soraphen biosynthetic gene region. The top line depicts the restriction map of p98/1 and shows the position of restriction sites and their distance from the left edge in kilobases. Restriction sites shown include: B, Bam HI; Bg, Bgl II; E, Eco RI; H, Hind III; Pv, Pvu I; Sm, Sma I. The boxes below the restriction map depict the location of the biosynthetic modules. The activity domains within each module are designated as follows: β-ketoacylsynthase (KS), Acyltransferase (AT), Ketoreductase (KR), Acyl Carrier Protein (ACP), Dehydratase (DH), Enoyl reductase (ER), and Thioesterase (TE).

FIG. 8: Construction of pCIB132 from pSUP2021.

FIG. 9: Restriction map of the clone pLSP18-6H3del3 from Pseudomonas aureofaciens carrying the phenazine biosynthetic gene region.

BRIEF DESCRIPTION OF THE SEQUENCES IN THE SEQUENCE LISTING

SEQ ID NO: 1: Sequence of the Pyrrolnitrin Gene Cluster
SEQ ID NO: 2: protein sequence for ORF1 of pyrrolnitrin gene cluster
SEQ ID NO: 3: protein sequence for ORF2 of pyrrolnitrin gene cluster
SEQ ID NO: 4: protein sequence for ORF3 of pyrrolnitrin gene cluster
SEQ ID NO: 5: protein sequence for ORF4 of pyrrolnitrin gene cluster
SEQ ID NO: 6: Sequence of the Soraphen Gene Cluster
SEQ ID NO: 7: Sequence of a Plant Consensus Translation Initiator (Clontech)
SEQ ID NO: 8: Sequence of a Plant Consensus Translation Initiator (Joshi)
SEQ ID NO: 9: Sequence of an Oligonucleotide for Use in a Molecular Adaptor
SEQ ID NO: 10: Sequence of an Oligonucleotide for Use in a Molecular Adaptor
SEQ ID NO: 11: Sequence of an Oligonucleotide for Use in a Molecular Adaptor
SEQ ID NO: 12: Sequence of an Oligonucleotide for Use in a Molecular Adaptor
SEQ ID NO: 13: Sequence of an Oligonucleotide for Use in a Molecular Adaptor
SEQ ID NO: 14: Sequence of an Oligonucleotide for Use in a Molecular Adaptor
SEQ ID NO: 15: oligonucleotide used to change restriction site
SEQ ID NO: 16: oligonucleotide used to change restriction site
SEQ ID NO: 17: Sequence of the Phenazine Gene Cluster
SEQ ID NO: 18: protein sequence for phz1 from the phenazine gene cluster
SEQ ID NO: 19: protein sequence for phz2 from the phenazine gene cluster
SEQ ID NO: 20: protein sequence for phz3 from the phenazine gene cluster
SEQ ID NO: 21: DNA sequence for phz4 of Phenazine gene cluster
SEQ ID NO: 22: protein sequence for phz4 from the phenazine gene cluster

DETAILED DESCRIPTION OF THE INVENTION

Production of Antipathogenic Substances by Microorganisms

Many organisms produce secondary metabolites and some of these inhibit the growth of other organisms. Since the discovery of penicillin, a large number of compounds with antibiotic activity have been identified, and the number continues to increase with ongoing screening efforts. Antibiotically active metabolites comprise a broad range of chemical structures. The most important include: aminoglycosides (e.g. streptomycin) and other carbohydrate containing antibiotics, peptide antibiotics (e.g. β-lactams, rhizocticin (see Rapp, C. et al., Liebigs Ann. Chem.: 655-661 (1988))), nucleoside derivatives (e.g. blasticidin S) and other heterocyclic antibiotics containing nitrogen (e.g. phenazine and pyrrolnitrin) and/or oxygen, polyketides (e.g. soraphen), macrocyclic lactones (e.g. erythromycin) and quinones (e.g. tetracycline).

Aminoglycosides and Other Carbohydrate Containing Antibiotics

The aminoglycosides are oligosaccharides consisting of an aminocyclohexanol moiety glycosidically linked to other amino sugars. Streptomycin, one of the best studied of the group, is produced by Streptomyces griseus. The biochemistry and biosynthesis of this compound are complex (for review see Mansouri et al. in: Genetics and Molecular Biology of Industrial Microorganisms (ed: Hershberger et al.), American Society for Microbiology, Washington, D.C. pp 61-67 (1989)) and involve 25 to 30 genes, 19 of which have been analyzed so far (Retzlaff et al. in: Industrial Microorganisms: Basic and Applied Molecular Genetics (ed.: Baltz et al.), American Society for Microbiology, Washington, D.C. pp 183-194 (1993)). Streptomycin, like many other aminoglycosides, inhibits protein synthesis in the target organisms.

Peptide Antibiotics

Peptide antibiotics are classifiable into two groups: (1) those which are synthesized by enzyme systems without the participation of the ribosomal apparatus, and (2) those which require the ribosomally-mediated translation of an mRNA to provide the precursor of the antibiotic.

Non-Ribosomal Peptide Antibiotics are assembled by large, multifunctional enzymes which activate, modify, polymerize and in some cases cyclize the subunit amino acids, forming polypeptide chains. Other acids, such as aminoadipic acid, diaminobutyric acid, diaminopropionic acid, dihydroxyamino acid, isoserine, dihydroxybenzoic acid, hydroxyisovaleric acid, (4R)-4-[(E)-2-butenyl]-4,N-dimethyl-L-threonine, and ornithine are also incorporated (Katz & Demain, Bacteriological Reviews 41: 449-474 (1977); Kleinkauf & von Dohren, Annual Review of Microbiology 41: 259-289 (1987)). The products are not encoded by any mRNA, and ribosomes do not directly participate in their synthesis. Peptide antibiotics synthesized non-ribosomally can in turn be grouped according to their general structures into linear, cyclic, lactone, branched cyclopeptide, and depsipeptide categories (Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990)). These different groups of antibiotics are produced by the action of modifying and cyclizing enzymes; the basic scheme of polymerization is common to them all. Non-ribosomally synthesized peptide antibiotics are produced by both bacteria and fungi, and include edeine, linear gramicidin, tyrocidine and gramicidin S from Bacillus brevis, mycobacillin from Bacillus subtilis, polymyxin from Bacillus polymyxa, etamycin from Streptomyces griseus, echinomycin from Streptomyces echinatus, actinomycin from Streptomyces clavuligerus, enterochelin from Escherichia coli, gamma-(alpha-L-aminoadipyl)-L-cysteinyl-D-valine (ACV) from Aspergillus nidulans, alamethicine from Trichoderma viride, destruxin from Metarhizium anisopliae, enniatin from Fusarium oxysporum, and beauvericin from Beauveria bassiana. Extensive functional and structural similarity exists between the prokaryotic and eukaryotic systems, suggesting a common origin for both. The activities of peptide antibiotics are similarly broad; toxic effects of different peptide antibiotics in animals, plants, bacteria, and fungi are known (Hansen, Annual Review of Microbiology 47: 535-564 (1993); Katz & Demain, Bacteriological Reviews 41: 449-474 (1977); Kleinkauf & von Dohren, Annual Review of Microbiology 41: 259-289 (1987); Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990); Kolter & Moreno, Annual Review of Microbiology 46: 141-163 (1992)).

Ribosomally-Synthesized Peptide Antibiotics are characterized by the existence of a structural gene for the antibiotic itself, which encodes a precursor that is modified by specific enzymes to create the mature molecule. The use of the general protein synthesis apparatus for peptide antibiotic synthesis opens up the possibility for much longer polymers to be made, although these peptide antibiotics are not necessarily very large. In addition to a structural gene, further genes are required for extracellular secretion and immunity, and these genes are believed to be located close to the structural gene, in most cases probably on the same operon. Two major groups of peptide antibiotics made on ribosomes exist: those which contain the unusual amino acid lanthionine, and those which do not. Lanthionine-containing antibiotics (lantibiotics) are produced by gram-positive bacteria, including species of Lactococcus, Staphylococcus, Streptococcus, Bacillus, and Streptomyces. Linear lantibiotics (for example, nisin, subtilin, epidermin, and gallidermin), and circular lantibiotics (for example, duramycin and cinnamycin), are known (Hansen, Annual Review of Microbiology 47: 535-564 (1993); Kolter & Moreno, Annual Review of Microbiology 46: 141-163 (1992)). Lantibiotics often contain other characteristic modified residues such as dehydroalanine (DHA) and dehydrobutyrine (DHB), which are derived from the dehydration of serine and threonine, respectively. The reaction of a thiol from cysteine with DHA yields lanthionine, and with DHB yields β-methyllanthionine. Peptide antibiotics which do not contain lanthionine may contain other modifications, or they may consist only of the ordinary amino acids used in protein synthesis. Non-lanthionine-containing peptide antibiotics are produced by both gram-positive and gram-negative bacteria, including Lactobacillus, Lactococcus, Pediococcus, Enterococcus, and Escherichia. Antibiotics in this category include lactacins, lactocins, sakacin A, pediocins, diplococcin, lactococcins, and microcins (Hansen, supra; Kolter & Moreno, supra).
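The residue conversions described for lantibiotics can be summarized as a small lookup. The following Python sketch is purely illustrative and is not part of the application: the function names and dictionary names are hypothetical, and the sketch records only which product arises from which residue, not the enzymes or conditions involved.

```python
from typing import Optional

# Hypothetical lookup tables for illustration only.
# Dehydration of serine and threonine gives the dehydro residues.
DEHYDRATION = {"serine": "DHA", "threonine": "DHB"}

# Addition of a cysteine thiol to a dehydro residue gives a thioether amino acid.
THIOL_ADDITION = {"DHA": "lanthionine", "DHB": "beta-methyllanthionine"}


def dehydrate(residue: str) -> str:
    """Return the dehydro residue for serine or threonine; other residues are unchanged."""
    return DEHYDRATION.get(residue, residue)


def add_cysteine_thiol(residue: str) -> Optional[str]:
    """Return the thioether product for DHA or DHB, or None if no reaction is modeled."""
    return THIOL_ADDITION.get(residue)


for start in ("serine", "threonine", "glycine"):
    dehydro = dehydrate(start)
    print(start, "->", dehydro, "->", add_cysteine_thiol(dehydro))
```

Residues other than serine and threonine pass through `dehydrate` unchanged and yield `None` from `add_cysteine_thiol`, reflecting that the text describes these modifications only for those two residues.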

Nucleoside Derivatives and Other Heterocyclic Antibiotics Containing Nitrogen and/or Oxygen

These compounds all contain heterocyclic rings but are otherwise structurally diverse and, as illustrated in the following examples, have very different biological activities.

Polyoxins and Nikkomycins are nucleoside derivatives and structurally resemble UDP-N-acetylglucosamine, the substrate of chitin synthase. They have been identified as competitive inhibitors of chitin synthase (Gooday, in: Biochemistry of Cell Walls and Membranes in Fungi (ed.: Kuhn et al.), Springer-Verlag, Berlin p. 61 (1990)). The polyoxins are produced by Streptomyces cacaoi and the Nikkomycins are produced by S. tendae.

Phenazines are nitrogen-containing heterocyclic compounds with a common planar aromatic tricyclic structure. Over 50 naturally occurring phenazines have been identified, each differing in the substituent groups on the basic ring structure. Compounds of this group are produced in nature exclusively by bacteria, in particular Streptomyces, Sorangium, and Pseudomonas (for review see Turner & Messenger, Advances in Microbial Physiology 27: 211-275 (1986)). Recently, the phenazine biosynthetic genes of a P. aureofaciens strain have been isolated (Pierson & Thomashow MPMI 5: 330-339 (1992)). Because of their planar aromatic structure, it has been proposed that phenazines may form intercalative complexes with DNA (Hollstein & van Gemert, Biochemistry 10: 497 (1971)), and thereby interfere with DNA metabolism. The phenazine myxin was shown to intercalate DNA (Hollstein & Butler, Biochemistry 11: 1345 (1972)) and the phenazine lomofungin was shown to inhibit RNA synthesis in yeast (Cannon & Jiminez, Biochemical Journal 142: 457 (1974); Ruet et al., Biochemistry 14: 4651 (1975)).

Pyrrolnitrin is a phenylpyrrole derivative with strong antibiotic activity and has been shown to inhibit a broad range of fungi (Homma et al., Soil Biol. Biochem. 21: 723-728 (1989); Nishida et al., J. Antibiot., ser A, 18: 211-219 (1965)). It was originally isolated from Pseudomonas pyrrocinia (Arima et al, J. Antibiot., ser. A, 18: 201-204 (1965)), and has since been isolated from several other Pseudomonas species and Myxococcus species (Gerth et al. J. Antibiot. 35: 1101-1103 (1982)). The compound has been reported to inhibit fungal respiratory electron transport (Tripathi & Gottlieb, J. Bacteriol. 100: 310-318 (1969)) and uncouple oxidative phosphorylation (Lambowitz & Slayman, J. Bacteriol. 112: 1020-1022 (1972)). It has also been proposed that pyrrolnitrin causes generalized lipoprotein membrane damage (Nose & Arima, J. Antibiot., ser A, 22: 135-143 (1969); Carlone & Scannerini, Mycopathologia et Mycologia Applicata 53: 111-123 (1974)). Pyrrolnitrin is biosynthesized from tryptophan (Chang et al. J. Antibiot. 34: 555-566) and the biosynthetic genes from P. fluorescens have now been cloned (see Section C of examples).

Polyketide Synthases

Many antibiotics, in spite of the apparent structural diversity, share a common pattern of biosynthesis. The molecules are built up from two-carbon building blocks, the β-carbon of which always carries a keto group, thus the name polyketide. The tremendous structural diversity derives from the different lengths of the polyketide chain and the different side-chains introduced, either as part of the two-carbon building blocks, or after the polyketide backbone is formed. The keto groups may also be reduced to hydroxyls or removed altogether. Each round of two-carbon addition is carried out by a complex of enzymes called the polyketide synthases (PKS) in a manner similar to fatty acid biosynthesis. The biosynthetic genes for an increasing number of polyketide antibiotics have been isolated and sequenced. It is quite apparent that the PKS genes are structurally conserved. The encoded proteins generally fall into two types: type I proteins are polyfunctional, with several catalytic domains carrying out different enzymatic steps covalently linked together (e.g. PKS for erythromycin, soraphen, and avermectin (Jaoua et al. Plasmid 28: 157-165 (1992); MacNeil et al. in: Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D.C. pp. 245-256 (1993))); whereas type II proteins are monofunctional (Hutchinson et al. in: Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D.C. pp. 203-216 (1993)). For the simpler polyketide antibiotics such as actinorhodin (produced by Streptomyces coelicolor), the several rounds of two-carbon additions are carried out iteratively on PKS enzymes encoded by one set of PKS genes. In contrast, synthesis of the more complicated compounds such as erythromycin and soraphen (see Section E of examples) involves sets of PKS genes organized into modules, with each module carrying out one round of two-carbon addition (for review see Hopwood et al. in: Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D.C. pp. 267-275 (1993)).
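The modular organization described above can be illustrated with a small data model. The following Python sketch is a simplified illustration under stated assumptions, not a description of any gene cluster in this application: the module names and domain compositions are hypothetical, the domain abbreviations follow those listed for FIG. 7, and the mapping from reductive domains (KR, DH, ER) to the state of the β-carbon reflects general polyketide and fatty acid biochemistry rather than a statement in the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Domains assumed here to be present in every chain-extending module.
CORE_DOMAINS = frozenset({"KS", "AT", "ACP"})


@dataclass(frozen=True)
class Module:
    """One hypothetical type I PKS module and the activity domains it contains."""
    name: str
    domains: FrozenSet[str]


def beta_carbon_state(module: Module) -> str:
    """Predict the beta-carbon state left by one round of two-carbon addition."""
    if not CORE_DOMAINS <= module.domains:
        raise ValueError(f"{module.name} lacks a core domain (KS, AT, ACP)")
    if "KR" not in module.domains:
        return "keto"            # keto group left unreduced
    if "DH" not in module.domains:
        return "hydroxyl"        # keto group reduced to a hydroxyl
    if "ER" not in module.domains:
        return "double bond"     # hydroxyl eliminated by dehydration
    return "methylene"           # keto group removed altogether


def describe_assembly(modules: List[Module]) -> List[str]:
    """Return the beta-carbon state produced by each module, in order."""
    return [beta_carbon_state(m) for m in modules]


def backbone_carbons_added(modules: List[Module]) -> int:
    """Each module carries out one round of two-carbon backbone extension."""
    return 2 * len(modules)


# Hypothetical three-module example.
example = [
    Module("module_1", frozenset({"KS", "AT", "ACP"})),
    Module("module_2", frozenset({"KS", "AT", "KR", "ACP"})),
    Module("module_3", frozenset({"KS", "AT", "KR", "DH", "ER", "ACP"})),
]
print(describe_assembly(example))
print(backbone_carbons_added(example))
```

Representing each module as a set of domains mirrors the way the modules are drawn as boxes of domains in a restriction map figure; a model of an iterative (actinorhodin-like) system would instead reuse a single module for every round.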

Macrocyclic Lactones

This group of compounds shares the presence of a large lactone ring with various ring substituents. They can be further classified into subgroups, depending on the ring size and other characteristics. The macrolides, for example, contain 12-, 14-, 16-, or 17-membered lactone rings glycosidically linked to one or more aminosugars and/or deoxysugars. They are inhibitors of protein synthesis, and are particularly effective against gram-positive bacteria. Erythromycin A, a well-studied macrolide produced by Saccharopolyspora erythraea, consists of a 14-membered lactone ring linked to two deoxy sugars. Many of the biosynthetic genes have been cloned; all have been located within a 60 kb segment of the S. erythraea chromosome. At least 22 closely linked open reading frames have been identified as likely to be involved in erythromycin biosynthesis (Donadio et al., in: Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D.C. pp 257-265 (1993)).

Quinones

Quinones are aromatic compounds with two carbonyl groups on a fully unsaturated ring. The compounds can be broadly classified into subgroups according to the number of aromatic rings present, i.e., benzoquinones, naphthoquinones, etc. A well-studied group is the tetracyclines, which contain a naphthacene ring with different substituents. Tetracyclines are protein synthesis inhibitors and are effective against both gram-positive and gram-negative bacteria, as well as rickettsias, mycoplasma, and spirochetes. The aromatic rings in the tetracyclines are derived from polyketide molecules. Genes involved in the biosynthesis of oxytetracycline (produced by Streptomyces rimosus) have been cloned and expressed in Streptomyces lividans (Binnie et al. J. Bacteriol. 171: 887-895 (1989)). The PKS genes share homology with those for actinorhodin and therefore encode type II (monofunctional) PKS proteins (Hopwood & Sherman, Ann. Rev. Genet. 24: 37-66 (1990)).

Other Types of APS

Several other types of APSs have been identified. One of these is the antibiotic 2-hexyl-5-propyl-resorcinol, which is produced by certain strains of Pseudomonas. It was first isolated from the Pseudomonas strain B-9004 (Kanda et al. J. Antibiot. 28: 935-942 (1975)) and is a dialkyl-substituted derivative of 1,3-dihydroxybenzene. It has been shown to have antipathogenic activity against Gram-positive bacteria (in particular Clavibacter sp.), mycobacteria, and fungi.

Another type of APS comprises the methoxyacrylates, such as strobilurin B. Strobilurin B is produced by Basidiomycetes and has a broad spectrum of fungicidal activity (Anke, T. et al., Journal of Antibiotics (Tokyo) 30: 806-810 (1977)). In particular, strobilurin B is produced by the fungus Bolinia lutea. Strobilurin B appears to have antifungal activity as a result of its ability to inhibit cytochrome b dependent electron transport, thereby inhibiting respiration (Becker, W. et al., FEBS Letters 132: 329-333 (1981)).

Most antibiotics have been isolated from bacteria, actinomycetes, and fungi. Their role in the biology of the host organism is often unknown, but many have been used with great success, both in medicine and agriculture, for the control of microbial pathogens. Antibiotics which have been used in agriculture include blasticidin S and kasugamycin for the control of rice blast (Pyricularia oryzae), validamycin for the control of Rhizoctonia solani, prumycin for the control of Botrytis and Sclerotinia species, and mildiomycin for the control of mildew.

To date, the use of antibiotics in plant protection has involved the production of the compounds through chemical synthesis or fermentation and application to seeds, plant parts, or soil. This invention describes the identification and isolation of the biosynthetic genes of a number of anti-phytopathogenic substances and further describes the use of these genes to create transgenic plants with enhanced disease resistance characteristics, and also the creation of improved biocontrol strains by expression of the isolated genes in organisms which colonize host plants or the rhizosphere. Furthermore, the availability of such genes provides methods for the production of APSs for isolation and application in antipathogenic formulations.

Methods for Cloning Genes for Antipathogenic Substances

Antibiotic biosynthetic genes can be cloned using a variety of techniques according to the invention. The simplest procedure for the cloning of APS genes requires the cloning of genomic DNA from an organism identified as producing an APS, and the transfer of the cloned DNA on a suitable plasmid or vector to a host organism which does not produce the APS, followed by the identification of transformed host colonies to which the APS-producing ability has been conferred. Using a technique such as λ::Tn5 transposon mutagenesis (de Bruijn & Lupski, Gene 27: 131-149 (1984)), the exact region of the transforming APS-conferring DNA can be more precisely defined. Alternatively or additionally, the transforming APS-conferring DNA can be cleaved into smaller fragments and the smallest which maintains the APS-conferring ability further characterized. Whereas the host organism lacking the ability to produce the APS may be a different species from the organism from which the APS derives, a variation of this technique involves the transformation of host DNA into the same host which has had its APS-producing ability disrupted by mutagenesis. In this method, an APS-producing organism is mutated and non-APS-producing mutants isolated, and these are complemented by cloned genomic DNA from the APS-producing parent strain. A further example of a standard technique used to clone genes required for APS biosynthesis is the use of transposon mutagenesis to generate mutants of an APS-producing organism which, after mutagenesis, fail to produce the APS. Thus, the region of the host genome responsible for APS production is tagged by the transposon and can be easily recovered and used as a probe to isolate the native genes from the parent strain. APS biosynthetic genes which are required for the synthesis of APSs and which are similar to known APS compounds may be clonable by virtue of their sequence homology to the biosynthetic genes of the known compounds. Techniques suitable for cloning by homology include standard library screening by DNA hybridization.

This invention also describes a novel technique for the isolation of APS biosynthetic genes which may be used to clone the genes for any APS, and is particularly useful for the cloning of APS biosynthetic genes which may be recalcitrant to cloning using any of the above techniques. One reason why such recalcitrance to cloning may exist is that the standard techniques described above (except for cloning by homology) may preferentially lead to the isolation of regulators of APS biosynthesis. Once such a regulator has been identified, however, it can be used in this novel method to isolate the biosynthetic genes under the control of the cloned regulator. In this method, a library of transposon insertion mutants is created in a strain of microorganism which lacks the regulator or has had the regulator gene disabled by conventional gene disruption techniques. The insertion transposon used carries a promoter-less reporter gene (e.g. lacZ).

Once the insertion library has been made, a functional copy of the regulator gene is transferred to the library of cells (e.g. by conjugation or electroporation) and the plated cells are selected for expression of the reporter gene. Cells are assayed before and after transfer of the regulator gene. Colonies which express the reporter gene only in the presence of the regulator gene are insertions adjacent to the promoter of genes regulated by the regulator. Assuming the regulator is specific in its regulation for APS-biosynthetic genes, then the genes tagged by this procedure will be APS-biosynthetic genes. In a preferred embodiment, the cloned regulator gene is the gafA gene described in PCT application WO 94/01561, which regulates the expression of the biosynthetic genes for pyrrolnitrin. Thus, this method is a preferred method for the cloning of the biosynthetic genes for pyrrolnitrin.
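The selection logic of the reporter screen described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Colony` record, its field names, and the example colony identifiers are hypothetical and do not come from the specification; each colony is assumed to have been scored for reporter (e.g. lacZ) expression once before and once after transfer of the regulator gene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colony:
    """Hypothetical record for one transposon-insertion mutant."""
    colony_id: str
    reporter_without_regulator: bool  # reporter scored before regulator transfer
    reporter_with_regulator: bool     # reporter scored after regulator transfer


def regulator_dependent(colonies):
    """Return IDs of colonies that express the reporter only when the
    regulator gene is present, i.e. candidate insertions adjacent to the
    promoter of a regulator-controlled gene."""
    return [
        c.colony_id
        for c in colonies
        if c.reporter_with_regulator and not c.reporter_without_regulator
    ]


# Illustrative (made-up) scoring results for three colonies.
colonies = [
    Colony("c1", reporter_without_regulator=False, reporter_with_regulator=True),
    Colony("c2", reporter_without_regulator=True, reporter_with_regulator=True),
    Colony("c3", reporter_without_regulator=False, reporter_with_regulator=False),
]
candidates = regulator_dependent(colonies)
```

Colonies that express the reporter regardless of the regulator (such as the hypothetical `c2`) are excluded, since their insertions lie behind promoters that the regulator does not control.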

In order for the cloned APS genes to be of use in transgenic expression, it is important that all the genes required for synthesis from a particular metabolite be identified and cloned. Using combinations of, or all of, the techniques described above, this is possible for any known APS. As most APS biosynthetic genes are clustered together in microorganisms, usually encoded by a single operon, the identification of all the genes will be possible from the identification of a single locus in an APS-producing microorganism. In addition, as regulators of APS biosynthetic genes are believed to regulate the whole pathway, the cloning of the biosynthetic genes via their regulators is a particularly attractive method of cloning these genes. In many cases the regulator will control transcription of the single entire operon, thus facilitating the cloning of genes using this strategy.

Using the methods described in this application, biosynthetic genes for any APS can be cloned from a microorganism, and using the methods of gene manipulation and transgenic plant production described in this specification, the cloned APS biosynthetic genes can be modified and expressed in transgenic plants. Suitable APS biosynthetic genes include those described at the beginning of this section, viz. aminoglycosides and other carbohydrate-containing antibiotics (e.g. streptomycin), peptide antibiotics (both non-ribosomally and ribosomally synthesized types), nucleoside derivatives and other heterocyclic antibiotics containing nitrogen and/or oxygen (e.g. polyoxins, nikkomycins, phenazines, and pyrrolnitrin), polyketides, macrocyclic lactones and quinones (e.g. soraphen, erythromycin and tetracycline). Expression in transgenic plants will be under the control of an appropriate promoter and involves appropriate cellular targeting considering the likely precursors required for the particular APS under consideration. Whereas the invention is intended to include the expression in transgenic plants of any APS gene isolatable by the procedures described in this specification, those which are particularly preferred include pyrrolnitrin, soraphen, phenazine, and the peptide antibiotics gramicidin and epidermin. The cloned biosynthetic genes can also be expressed in soil-borne or plant-colonizing organisms for the purpose of conferring and enhancing biocontrol efficacy in these organisms. Particularly preferred APS genes for this purpose are those which encode pyrrolnitrin, soraphen, phenazine, and the peptide antibiotics.

Production of Antipathogenic Substances in Heterologous Microbial Hosts

Cloned APS genes can be expressed in heterologous bacterial or fungal hosts to enable the production of the APS with greater efficiency than might be possible from native hosts. Techniques for these genetic manipulations are specific for the different available hosts and are known in the art. For example, the expression vectors pKK223-3 and pKK223-2 can be used to express heterologous genes in E. coli, either in transcriptional or translational fusion, behind the tac or trc promoter. For the expression of operons encoding multiple ORFs, the simplest procedure is to insert the operon into a vector such as pKK223-3 in transcriptional fusion, allowing the cognate ribosome binding site of the heterologous genes to be used. Techniques for overexpression in gram-positive species such as Bacillus are also known in the art and can be used in the context of this invention (Quax et al. in: Industrial Microorganisms: Basic and Applied Molecular Genetics, Eds. Baltz et al., American Society for Microbiology, Washington (1993)). Alternate systems for overexpression rely on yeast vectors and include the use of Pichia, Saccharomyces and Kluyveromyces (Sreekrishna, in: Industrial Microorganisms: Basic and Applied Molecular Genetics, Baltz, Hegeman, and Skatrud eds., American Society for Microbiology, Washington (1993); Dequin & Barre, Biotechnology 12: 173-177 (1994); van den Berg et al., Biotechnology 8: 135-139 (1990)).

Cloned APS genes can also be expressed in heterologous bacterial and fungal hosts with the aim of increasing the efficacy of biocontrol strains of such bacterial and fungal hosts. Microorganisms which are suitable for the heterologous overexpression of APS genes are all microorganisms which are capable of colonizing plants or the rhizosphere. As such they will be brought into contact with phytopathogenic fungi, bacteria and nematodes, causing an inhibition of their growth. These include gram-negative microorganisms such as Pseudomonas, Enterobacter and Serratia, the gram-positive microorganism Bacillus and the fungi Trichoderma and Gliocladium. Particularly preferred heterologous hosts are Pseudomonas fluorescens, Pseudomonas putida, Pseudomonas cepacia, Pseudomonas aureofaciens, Pseudomonas aurantiaca, Enterobacter cloacae, Serratia marcescens, Bacillus subtilis, Bacillus cereus, Trichoderma viride, Trichoderma harzianum and Gliocladium virens. In preferred embodiments of the invention the biosynthetic genes for pyrrolnitrin, soraphen, phenazine, and peptide antibiotics are transferred to the particularly preferred heterologous hosts listed above. In a particularly preferred embodiment, the biosynthetic genes for phenazine and/or soraphen are transferred to and expressed in Pseudomonas fluorescens strain CGA267356 (described in the published application EP 0 472 494), which has biocontrol utility due to its production of pyrrolnitrin (but not phenazine). In another preferred embodiment, the biosynthetic genes for pyrrolnitrin and/or soraphen are transferred to Pseudomonas aureofaciens strain 30-84, which has biocontrol characteristics due to its production of phenazine. Expression in heterologous biocontrol strains requires the selection of vectors appropriate for replication in the chosen host and a suitable choice of promoter. Techniques are well known in the art for expression in gram-negative and gram-positive bacteria and fungi and are described elsewhere in this specification.

Expression of Genes for Anti-phytopathogenic Substances in Plants

The APS biosynthetic genes of this invention are expressed in transgenic plants, thus causing the biosynthesis of the selected APS in the transgenic plants. In this way transgenic plants with enhanced resistance to phytopathogenic fungi, bacteria and nematodes are generated. For their expression in transgenic plants, the APS genes and adjacent sequences may require modification and optimization.

Although in many cases genes from microbial organisms can be expressed in plants at high levels without modification, low expression in transgenic plants may result from APS genes having codons which are not preferred in plants. It is known in the art that all organisms have specific preferences for codon usage, and the APS gene codons can be changed to conform with plant preferences, while maintaining the amino acids encoded. Furthermore, high expression in plants is best achieved from coding sequences which have at least 35% GC content, and preferably more than 45%. Microbial genes which have low GC contents may express poorly in plants due to the existence of ATTTA motifs which may destabilize messages, and AATAAA motifs which may cause inappropriate polyadenylation. In addition, potential APS biosynthetic genes can be screened for the existence of illegitimate splice sites which may cause message truncation. All changes required to be made within the APS coding sequence such as those described above can be made using well known techniques of site-directed mutagenesis, PCR, and synthetic gene construction using the methods described in the published patent applications EP 0 385 962 (to Monsanto), EP 0 359 472 (to Lubrizol), and WO 93/07278 (to Ciba-Geigy). The preferred APS biosynthetic genes may be unmodified genes, should these be expressed at high levels in target transgenic plant species, or alternatively may be genes modified by the removal of destabilization and inappropriate polyadenylation motifs and illegitimate splice sites, and further modified by the incorporation of plant-preferred codons and a GC content preferred for expression in plants. Although preferred gene sequences may be adequately expressed in both monocotyledonous and dicotyledonous plant species, sequences can be modified to account for the specific codon preferences and GC content preferences of monocotyledons or dicotyledons, as these preferences have been shown to differ (Murray et al. Nucl. Acids Res. 17: 477-498 (1989)).
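A first-pass screen of a candidate coding sequence against the criteria named above (GC content of at least 35%, preferably more than 45%, and the presence of ATTTA or AATAAA motifs) might look like the following Python sketch. The function names and the short example sequence are hypothetical and purely illustrative; the sketch does not attempt splice-site prediction or codon replacement, which require the methods cited in the text.

```python
def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq, motif):
    """0-based start positions of every (possibly overlapping) occurrence."""
    seq = seq.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def screen_coding_sequence(seq):
    """Summarize the plant-expression criteria discussed in the text."""
    gc = gc_content(seq)
    return {
        "gc_percent": gc,
        "meets_minimum_gc": gc >= 35.0,    # "at least 35% GC content"
        "meets_preferred_gc": gc > 45.0,   # "preferably more than 45%"
        "ATTTA": find_motif(seq, "ATTTA"),    # possible message destabilization
        "AATAAA": find_motif(seq, "AATAAA"),  # possible inappropriate polyadenylation
    }


# Made-up 20-base fragment, not a real gene.
report = screen_coding_sequence("ATGATTTAGCGCAATAAAGC")
```

A sequence flagged by such a screen would then be a candidate for the modifications described above, made while maintaining the encoded amino acids.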

For efficient initiation of translation, sequences adjacent to the initiating methionine may require modification. The sequences cognate to the selected APS genes may initiate translation efficiently in plants, or alternatively may do so inefficiently. In the case that they do so inefficiently, they can be modified by the inclusion of sequences known to be effective in plants. Joshi has suggested an appropriate consensus for plants (NAR 15: 6643-6653 (1987); SEQ ID NO:8) and Clontech suggests a further consensus translation initiator (1993/1994 catalog, page 210; SEQ ID NO:7).

These consensuses are suitable for use with the APS biosynthetic genes of this invention. The sequences are incorporated into the APS gene construction, up to and including the ATG (whilst leaving the second amino acid of the APS gene unmodified), or alternatively up to and including the GTC subsequent to the ATG (with the possibility of modifying the second amino acid of the transgene).

Expression of APS genes in transgenic plants is behind a promoter shown to be functional in plants. The choice of promoter will vary depending on the temporal and spatial requirements for expression, and also depending on the target species. For the protection of plants against foliar pathogens, expression in leaves is preferred; for the protection of plants against ear pathogens, expression in inflorescences (e.g. spikes, panicles, cobs etc.) is preferred; for protection of plants against root pathogens, expression in roots is preferred; for protection of seedlings against soil-borne pathogens, expression in roots and/or seedlings is preferred. In many cases, however, protection against more than one type of phytopathogen will be sought, and thus expression in multiple tissues will be desirable. Although many promoters from dicotyledons have been shown to be operational in monocotyledons and vice versa, ideally dicotyledonous promoters are selected for expression in dicotyledons, and monocotyledonous promoters for expression in monocotyledons. However, there is no restriction on the provenance of selected promoters; it is sufficient that they are operational in driving the expression of the APS biosynthetic genes. In some cases, expression of APSs in plants may provide protection against insect pests. Transgenic expression of the biosynthetic genes for the APS beauvericin (isolated from Beauveria bassiana) may, for example, provide protection against insect pests of crop plants.

Preferred promoters which are expressed constitutively include the CaMV 35S and 19S promoters, and promoters from genes encoding actin or ubiquitin. Further preferred constitutive promoters are those from the 12(4-28), CP21, CP24, CP38, and CP29 genes whose cDNAs are provided by this invention.

The APS genes of this invention can also be expressed under the regulation of promoters which are chemically regulated. This enables the APS to be synthesized only when the crop plants are treated with the inducing chemicals, and APS biosynthesis subsequently declines. Preferred technology for chemical induction of gene expression is detailed in the published application EP 0 332 104 (to Ciba-Geigy) and pending application Ser. No. 08/181,271 (incorporated herein by reference). A preferred promoter for chemical induction is the tobacco PR-1a promoter.

A preferred category of promoters is that which is wound inducible. Numerous promoters have been described which are expressed at wound sites and also at the sites of phytopathogen infection. These are suitable for the expression of APS genes because APS biosynthesis is turned on by phytopathogen infection and thus the APS only accumulates when infection occurs. Ideally, such a promoter should only be active locally at the sites of infection, and in this way APS only accumulates in cells which need to synthesize the APS to kill the invading phytopathogen. Preferred promoters of this kind include those described by Stanford et al. Mol. Gen. Genet. 215: 200-208 (1989), Xu et al. Plant Molec. Biol. 22: 573-588 (1993), Logemann et al. Plant Cell 1: 151-158 (1989), Rohrmeier & Lehle, Plant Molec. Biol. 22: 783-792 (1993), Firek et al. Plant Molec. Biol. 22: 129-142 (1993), and Warner et al. Plant J. 3: 191-201 (1993).

Preferred tissue-specific expression patterns include green tissue specific, root specific, stem specific, and flower specific. Promoters suitable for expression in green tissue include many which regulate genes involved in photosynthesis, and many of these have been cloned from both monocotyledons and dicotyledons. A preferred promoter is the maize PEPC promoter from the phosphoenolpyruvate carboxylase gene (Hudspeth & Grula, Plant Molec. Biol. 12: 579-589 (1989)). A preferred promoter for root-specific expression is that described by de Framond (FEBS 290: 103-106 (1991); EP 0 452 269 to Ciba-Geigy) and a further preferred root-specific promoter is that from the T-1 gene provided by this invention. A preferred stem-specific promoter is that described in patent application WO 93/07278 (to Ciba-Geigy) and which drives expression of the maize trpA gene.

Preferred embodiments of the invention are transgenic plants expressing APS biosynthetic genes in a root-specific fashion. In an especially preferred embodiment of the invention the biosynthetic genes for pyrrolnitrin are expressed behind a root-specific promoter to protect transgenic plants against the phytopathogen Rhizoctonia. In another especially preferred embodiment of the invention the biosynthetic genes for phenazine are expressed behind a root-specific promoter to protect transgenic plants against the phytopathogen Gaeumannomyces graminis. Further preferred embodiments are transgenic plants expressing APS biosynthetic genes in a wound-inducible or pathogen infection-inducible manner. For example, a further especially preferred embodiment involves the expression of the biosynthetic genes for soraphen behind a wound-inducible or pathogen-inducible promoter for the control of foliar pathogens.

In addition to the selection of a suitable promoter, constructions for APS expression in plants require an appropriate transcription terminator to be attached downstream of the heterologous APS gene. Several such terminators are available and known in the art (e.g. tm1 from CaMV, E9 from rbcS). Any available terminator known to function in plants can be used in the context of this invention.

Numerous other sequences can be incorporated into expression cassettes for APS genes. These include sequences which have been shown to enhance expression such as intron sequences (e.g. from Adh1 and bronze1) and viral leader sequences (e.g. from TMV, MCMV and AMV).

The overproduction of APSs in plants requires that the APS biosynthetic gene encoding the first step in the pathway will have access to the pathway substrate. For each individual APS and pathway involved, this substrate will likely differ, and so too may its cellular localization in the plant. In many cases the substrate may be localized in the cytosol, whereas in other cases it may be localized in some subcellular organelle. As much biosynthetic activity in the plant occurs in the chloroplast, often the substrate may be localized to the chloroplast and consequently the APS biosynthetic gene products for such a pathway are best targeted to the appropriate organelle (e.g. the chloroplast). Subcellular localization of transgene-encoded enzymes can be undertaken using techniques well known in the art. Typically, the DNA encoding the target peptide from a known organelle-targeted gene product is manipulated and fused upstream of the required APS gene/s. Many such target sequences are known for the chloroplast and their functioning in heterologous constructions has been shown. In a preferred embodiment of this invention the genes for pyrrolnitrin biosynthesis are targeted to the chloroplast because the pathway substrate tryptophan is synthesized in the chloroplast.

In some situations, the overexpression of APS genes may deplete the cellular availability of the substrate for a particular pathway and this may have detrimental effects on the cell. In situations such as this it is desirable to increase the amount of substrate available by the overexpression of genes which encode the enzymes for the biosynthesis of the substrate. In the case of tryptophan (the substrate for pyrrolnitrin biosynthesis) this can be achieved by overexpressing the trpA and trpB genes as well as anthranilate synthase subunits. Similarly, overexpression of the enzymes for chorismate biosynthesis such as DAHP synthase will be effective in producing the precursor required for phenazine production. A further way of making more substrate available is by the turning off of known pathways which utilize specific substrates (provided this can be done without detrimental side effects). In this manner, the substrate synthesized is channeled towards the biosynthesis of the APS and not towards other compounds.

Vectors suitable for plant transformation are described elsewhere in this specification. For Agrobacterium-mediated transformation, binary vectors or vectors carrying at least one T-DNA border sequence are suitable, whereas for direct gene transfer any vector is suitable and linear DNA containing only the construction of interest may be preferred. In the case of direct gene transfer, transformation with a single DNA species or co-transformation can be used (Schocher et al. Biotechnology 4: 1093-1096 (1986)). For both direct gene transfer and Agrobacterium-mediated transfer, transformation is usually (but not necessarily) undertaken with a selectable marker which may provide resistance to an antibiotic (kanamycin, hygromycin or methotrexate) or a herbicide (basta). The choice of selectable marker is not, however, critical to the invention.

Synthesis of an APS in a transgenic plant will frequently require the simultaneous overexpression of multiple genes encoding the APS biosynthetic enzymes. This can be achieved by transforming the individual APS biosynthetic genes into different plant lines individually, and then crossing the resultant lines. Selection and maintenance of lines carrying multiple genes is facilitated if each of the various transformation constructions utilizes a different selectable marker. A line in which all the required APS biosynthetic genes have been pyramided will synthesize the APS, whereas other lines will not. This approach may be suitable for hybrid crops such as maize in which the final hybrid is necessarily a cross between two parents. The maintenance of different inbred lines with different APS genes may also be advantageous in situations where a particular APS pathway may lead to multiple APS products, each of which has a utility. By utilizing different lines carrying different alternative genes for later steps in the pathway to make a hybrid cross with lines carrying all the remaining required genes, it is possible to generate different hybrids carrying different selected APSs which may have different utilities.
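The pyramiding logic described above can be illustrated with simple set operations. The following Python sketch rests on assumptions that are not in the specification: the gene labels (`aps1` to `aps4`) are hypothetical, each parent line is assumed to be homozygous for its transgenes so that the hybrid carries every gene present in either parent, and only the presence of genes is modeled, not their expression levels.

```python
# Hypothetical set of genes required for a complete APS pathway.
REQUIRED_GENES = frozenset({"aps1", "aps2", "aps3", "aps4"})


def cross(parent_a, parent_b):
    """Genes carried by the hybrid of two lines.

    Assumes both parents are homozygous for their transgenes, so the
    hybrid carries at least one copy of each gene from either parent.
    """
    return frozenset(parent_a) | frozenset(parent_b)


def synthesizes_aps(line, required=REQUIRED_GENES):
    """A line is expected to make the APS only if it carries every required gene."""
    return required <= frozenset(line)


# Two illustrative inbred lines, each carrying part of the pathway.
line_1 = {"aps1", "aps2"}
line_2 = {"aps3", "aps4"}
hybrid = cross(line_1, line_2)
```

In this sketch neither parent line alone satisfies `synthesizes_aps`, while the hybrid does, mirroring the statement that only a line in which all required genes have been pyramided will synthesize the APS.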

Alternate methods of producing plant lines carrying multiple genes include the retransformation of existing lines already transformed with an APS gene or APS genes (and selection with a different marker), and also the use of single transformation vectors which carry multiple APS genes, each under appropriate regulatory control (i.e. promoter, terminator etc.). Given the ease of DNA construction, the manipulation of cloning vectors to carry multiple APS genes is a preferred method.

Production of Antipathogenic Substances in Heterologous Hosts

The present invention also provides methods for obtaining APSs. These APSs may be effective in the inhibition of growth of microbes, particularly phytopathogenic microbes. The APSs can be produced from organisms in which the APS genes have been overexpressed, and suitable organisms for this include gram-negative and gram-positive bacteria and yeast, as well as plants. For the purposes of APS production, the significant criteria in the choice of host organism are its ease of manipulation, rapidity of growth (i.e. fermentation in the case of microorganisms), and its lack of susceptibility to the APS being overproduced. These methods of APS production have significant advantages over the chemical synthesis technology usually used in the preparation of APSs such as antibiotics. These advantages are the cheaper cost of production, and the ability to synthesize compounds of a preferred biological enantiomer, as opposed to the racemic mixtures inevitably generated by organic synthesis. The ability to produce stereochemically appropriate compounds is particularly important for molecules with many chirally active carbon atoms. APSs produced by heterologous hosts can be used in medical (i.e. control of pathogens and/or infectious disease) as well as agricultural applications.

Formulation of Antipathogenic Compositions

The present invention further embraces the preparation of antifungal compositions in which the active ingredient is the antibiotic substance produced by the recombinant biocontrol agent of the present invention or alternatively a suspension or concentrate of the microorganism. The active ingredient is homogeneously mixed with one or more compounds or groups of compounds described herein. The present invention also relates to methods of treating plants, which comprise application of the active ingredient, or antifungal compositions containing the active ingredient, to plants.

The active ingredients of the present invention are normally applied in the form of compositions and can be applied to the crop area or plant to be treated, simultaneously or in succession, with further compounds. These compounds can be fertilizers, micronutrient donors or other preparations that influence plant growth. They can also be selective herbicides, insecticides, fungicides, bactericides, nematicides, molluscicides or mixtures of several of these preparations, if desired together with further carriers, surfactants or application-promoting adjuvants customarily employed in the art of formulation. Suitable carriers and adjuvants can be solid or liquid and correspond to the substances ordinarily employed in formulation technology, e.g. natural or regenerated mineral substances, solvents, dispersants, wetting agents, tackifiers, binders or fertilizers.

A preferred method of applying active ingredients of the present invention or an agrochemical composition which contains at least one of the active ingredients is leaf application. The number of applications and the rate of application depend on the intensity of infestation by the corresponding phytopathogen (type of fungus). However, the active ingredients can also penetrate the plant through the roots via the soil (systemic action) by impregnating the locus of the plant with a liquid composition, or by applying the compounds in solid form to the soil, e.g. in granular form (soil application). The active ingredients may also be applied to seeds (coating) by impregnating the seeds either with a liquid formulation containing active ingredients, or coating them with a solid formulation. In special cases, further types of application are also possible, for example, selective treatment of the plant stems or buds.

The active ingredients are used in unmodified form or, preferably, together with the adjuvants conventionally employed in the art of formulation, and are therefore formulated in known manner to emulsifiable concentrates, coatable pastes, directly sprayable or dilutable solutions, dilute emulsions, wettable powders, soluble powders, dusts, granulates, and also encapsulations, for example, in polymer substances. Like the nature of the compositions, the methods of application, such as spraying, atomizing, dusting, scattering or pouring, are chosen in accordance with the intended objectives and the prevailing circumstances. Advantageous rates of application are normally from 50 g to 5 kg of active ingredient (a.i.) per hectare, preferably from 100 g to 2 kg a.i./ha, most preferably from 200 g to 500 g a.i./ha.

The formulations, compositions or preparations containing the active ingredients and, where appropriate, a solid or liquid adjuvant, are prepared in known manner, for example by homogeneously mixing and/or grinding the active ingredients with extenders, for example solvents, solid carriers and, where appropriate, surface-active compounds (surfactants).

Suitable solvents include aromatic hydrocarbons, preferably the fractions having 8 to 12 carbon atoms, for example, xylene mixtures or substituted naphthalenes, phthalates such as dibutyl phthalate or dioctyl phthalate, aliphatic hydrocarbons such as cyclohexane or paraffins, alcohols and glycols and their ethers and esters, such as ethanol, ethylene glycol monomethyl or monoethyl ether, ketones such as cyclohexanone, strongly polar solvents such as N-methyl-2-pyrrolidone, dimethyl sulfoxide or dimethyl formamide, as well as epoxidized vegetable oils such as epoxidized coconut oil or soybean oil; or water.

The solid carriers used, e.g. for dusts and dispersible powders, are normally natural mineral fillers such as calcite, talcum, kaolin, montmorillonite or attapulgite. In order to improve the physical properties it is also possible to add highly dispersed silicic acid or highly dispersed absorbent polymers. Suitable granulated adsorptive carriers are porous types, for example pumice, broken brick, sepiolite or bentonite; and suitable nonsorbent carriers are materials such as calcite or sand. In addition, a great number of pregranulated materials of inorganic or organic nature can be used, e.g. especially dolomite or pulverized plant residues.

Depending on the nature of the active ingredient to be used in the formulation, suitable surface-active compounds are nonionic, cationic and/or anionic surfactants having good emulsifying, dispersing and wetting properties. The term "surfactants" will also be understood as comprising mixtures of surfactants.

Suitable anionic surfactants can be both water-soluble soaps and water-soluble synthetic surface-active compounds.

Suitable soaps are the alkali metal salts, alkaline earth metal salts or unsubstituted or substituted ammonium salts of higher fatty acids (chains of 10 to 22 carbon atoms), for example the sodium or potassium salts of oleic or stearic acid, or of natural fatty acid mixtures which can be obtained for example from coconut oil or tallow oil. The fatty acid methyltaurin salts may also be used.

More frequently, however, so-called synthetic surfactants are used, especially fatty sulfonates, fatty sulfates, sulfonated benzimidazole derivatives or alkylarylsulfonates.

The fatty sulfonates or sulfates are usually in the form of alkali metal salts, alkaline earth metal salts or unsubstituted or substituted ammonium salts and have an 8 to 22 carbon alkyl radical which also includes the alkyl moiety of alkyl radicals, for example, the sodium or calcium salt of lignosulfonic acid, of dodecylsulfate or of a mixture of fatty alcohol sulfates obtained from natural fatty acids. These compounds also comprise the salts of sulfuric acid esters and sulfonic acids of fatty alcohol/ethylene oxide adducts. The sulfonated benzimidazole derivatives preferably contain 2 sulfonic acid groups and one fatty acid radical containing 8 to 22 carbon atoms. Examples of alkylarylsulfonates are the sodium, calcium or triethanolamine salts of dodecylbenzenesulfonic acid, dibutylnaphthalenesulfonic acid, or of a naphthalenesulfonic acid/formaldehyde condensation product. Also suitable are corresponding phosphates, e.g. salts of the phosphoric acid ester of an adduct of p-nonylphenol with 4 to 14 moles of ethylene oxide.

Non-ionic surfactants are preferably polyglycol ether derivatives of aliphatic or cycloaliphatic alcohols, or saturated or unsaturated fatty acids and alkylphenols, said derivatives containing 3 to 30 glycol ether groups and 8 to 20 carbon atoms in the (aliphatic) hydrocarbon moiety and 6 to 18 carbon atoms in the alkyl moiety of the alkylphenols.

Further suitable non-ionic surfactants are the water-soluble adducts of polyethylene oxide with polypropylene glycol, ethylenediamine propylene glycol and alkylpolypropylene glycol containing 1 to 10 carbon atoms in the alkyl chain, which adducts contain 20 to 250 ethylene glycol ether groups and 10 to 100 propylene glycol ether groups. These compounds usually contain 1 to 5 ethylene glycol units per propylene glycol unit.

Representative examples of non-ionic surfactants are nonylphenolpolyethoxyethanols, castor oil polyglycol ethers, polypropylene/polyethylene oxide adducts, tributylphenoxypolyethoxyethanol, polyethylene glycol and octylphenoxyethoxyethanol. Fatty acid esters of polyoxyethylene sorbitan and polyoxyethylene sorbitan trioleate are also suitable non-ionic surfactants.

Cationic surfactants are preferably quaternary ammonium salts which have, as N-substituent, at least one C8-C22 alkyl radical and, as further substituents, lower unsubstituted or halogenated alkyl, benzyl or lower hydroxyalkyl radicals. The salts are preferably in the form of halides, methylsulfates or ethylsulfates, e.g. stearyltrimethylammonium chloride or benzyldi(2-chloroethyl)ethylammonium bromide.

The surfactants customarily employed in the art of formulation are described, for example, in "McCutcheon's Detergents and Emulsifiers Annual," MC Publishing Corp., Ringwood, N.J., 1979, and Sisley and Wood, "Encyclopedia of Surface Active Agents," Chemical Publishing Co., Inc., New York, 1980.

The agrochemical compositions usually contain from about 0.1 to about 99%, preferably about 0.1 to about 95%, and most preferably from about 3 to about 90% of the active ingredient, from about 1 to about 99.9%, preferably from about 1 to about 99%, and most preferably from about 5 to about 95% of a solid or liquid adjuvant, and from about 0 to about 25%, preferably about 0.1 to about 25%, and most preferably from about 0.1 to about 20% of a surfactant.

Whereas commercial products are preferably formulated as concentrates, the end user will normally employ dilute formulations.

EXAMPLES

The following examples serve as further description of the invention and methods for practicing the invention. They are not intended as being limiting, rather as providing guidelines on how the invention may be practiced.

A. Identification of Microorganisms which Produce Antipathogenic Substances

Microorganisms can be isolated from many sources and screened for their ability to inhibit fungal or bacterial growth in vitro. Typically the microorganisms are diluted and plated on medium onto or into which fungal spores or mycelial fragments, or bacteria have been or are to be introduced. Thus, zones of clearing around a newly isolated bacterial colony are indicative of antipathogenic activity.

Example 1

Isolation of Microorganisms with Anti-Rhizoctonia Properties from Soil

A gram of soil (containing approximately 10⁶-10⁸ bacteria) is suspended in 10 ml sterile water. After vigorous mixing, the soil particles are allowed to settle. Appropriate dilutions are made and aliquots are plated on nutrient agar plates (or other growth medium as appropriate) to obtain 50-100 colonies per plate. Freshly cultured Rhizoctonia mycelia are fragmented by blending and suspensions of fungal fragments are sprayed onto the agar plates after the bacterial colonies have grown to be just visible. Bacterial isolates with antifungal activities can be identified by the fungus-free zones surrounding them upon further incubation of the plates.
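The "appropriate dilutions" can be estimated with simple arithmetic. The Python sketch below is a minimal illustration under stated assumptions: the plated aliquot volume of 0.1 ml is an assumed value (the text does not specify it), and the function names are hypothetical.

```python
def cells_per_ml(cells_per_gram, soil_g=1.0, suspension_ml=10.0):
    """Cell density of the initial soil suspension."""
    return cells_per_gram * soil_g / suspension_ml


def expected_colonies(cells_per_gram, dilution_fold, plated_ml=0.1,
                      soil_g=1.0, suspension_ml=10.0):
    """Expected colonies on one plate for a given fold dilution.

    plated_ml is an assumed aliquot volume, not a value from the text.
    """
    density = cells_per_ml(cells_per_gram, soil_g, suspension_ml)
    return density / dilution_fold * plated_ml


def required_dilution(cells_per_gram, target_colonies, plated_ml=0.1,
                      soil_g=1.0, suspension_ml=10.0):
    """Fold dilution that should give about target_colonies per plate."""
    density = cells_per_ml(cells_per_gram, soil_g, suspension_ml)
    return density * plated_ml / target_colonies


if __name__ == "__main__":
    # Bracket the stated range of 10^6 to 10^8 bacteria per gram of soil.
    for load in (1e6, 1e7, 1e8):
        print(load, required_dilution(load, target_colonies=75))
```

Because the bacterial load of a soil sample is only known to within about two orders of magnitude, several dilutions spanning the computed range would normally be plated in parallel.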

The production of bioactive metabolites by such isolates is confirmed by the use of culture filtrates in place of live colonies in the plate assay described above. Such bioassays can also be used for monitoring the purification of the metabolites. Purification may start with an organic solvent extraction step and depending on whether the active principle is extracted into the organic phase or left in the aqueous phase, different chromatographic steps follow. These chromatographic steps are well known in the art. Ultimately, purity and chemical identity are determined using spectroscopic methods.

B. Cloning Antipathogenic Biosynthetic Genes from Microorganisms

Example 2

Shotgun Cloning Antipathogenic Biosynthetic Genes from their Native Source

Related biosynthetic genes are typically located in close proximity to each other in microorganisms and more than one open reading frame is often encoded by a single operon. Consequently, one approach to the cloning of genes which encode enzymes in a single biosynthetic pathway is the transfer of genome fragments from a microorganism containing said pathway to one which does not, with subsequent screening for a phenotype conferred by the pathway.

In the case of biosynthetic genes encoding enzymes leading to the production of an antipathogenic substance (APS), genomic DNA of the antipathogenic substance producing microorganism is isolated, digested with a restriction endonuclease such as Sau3A, size fractionated for the isolation of fragments of a selected size (the selected size depends on the vector being used), and fragments of the selected size are cloned into a vector (e.g. the BamHI site of a cosmid vector) for transfer to E. coli. The resulting E. coli clones are then screened for those which are producing the antipathogenic substance. Such screens may be based on the direct detection of the antipathogenic substance, such as a biochemical assay.
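The digestion and size fractionation step can be mimicked in silico. The following Python sketch is a simplified illustration only: it models a complete digest of a linear sequence (in practice a partial digest is used to obtain large fragments), and the function names and toy sequence are hypothetical. Sau3A cuts immediately 5' of its GATC recognition sequence, so every internal fragment carries a GATC overhang compatible with the ends generated by BamHI at GGATCC.

```python
SAU3A_SITE = "GATC"  # Sau3A cuts immediately 5' of GATC


def sau3a_fragments(seq):
    """Return fragments from a complete Sau3A digest of a linear sequence."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(SAU3A_SITE)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(SAU3A_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Skip empty pieces (e.g. when a site sits at the very start).
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length lies in [min_len, max_len]."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


if __name__ == "__main__":
    toy = "AAGATCTTGGATCCAA"  # toy sequence with two GATC sites
    frags = sau3a_fragments(toy)
    print(frags)
    print(size_select(frags, 5, 10))
```

For real work the selected size window would be set by the packaging limits of the vector, as noted in the text.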

Alternatively, such screens may be based on the adverse effect associated with the antipathogenic substance upon a target pathogen. In these screens, the clones producing the antipathogenic substance are selected for their ability to kill or retard the growth of the target pathogen. Such an inhibitory activity forms the basis for standard screening assays well known in the art, such as screening for the ability to produce zones of clearing on a bacterial plate impregnated with the target pathogen (e.g. spores where the target pathogen is a fungus, cells where the target pathogen is a bacterium). Clones selected for their antipathogenic activity can then be further analyzed to confirm the presence of the antipathogenic substance using the standard chemical and biochemical techniques appropriate for the particular antipathogenic substance.

Further characterization and identification of the genes encoding the biosynthetic enzymes for the antipathogenic substance is achieved as follows. DNA inserts from positively identified E. coli clones are isolated and further digested into smaller fragments. The smaller fragments are then recloned into vectors and reinserted into E. coli with subsequent reassaying for the antipathogenic phenotype. Alternatively, positively identified clones can be subjected to λ::Tn5 transposon mutagenesis using techniques well known in the art (e.g. de Bruijn & Lupski, Gene 27: 131-149 (1984)). Using this method a number of disruptive transposon insertions are introduced into the DNA shown to confer APS production to enable a delineation of the precise region/s of the DNA which are responsible for APS production. Subsequently, determination of the sequence of the smallest insert found to confer antipathogenic substance production on E. coli will reveal the open reading frames required for APS production. These open reading frames can ultimately be disrupted (see below) to confirm their role in the biosynthesis of the antipathogenic substance.

Various host organisms such as Bacillus and yeast may be substituted for E. coli in the techniques described using suitable cloning vectors known in the art for such hosts. The choice of host organism has only one limitation: it should not be sensitive to the antipathogenic substance for which the biosynthetic genes are being cloned.

Example 3

Cloning Biosynthetic Genes for an Antipathogenic Substance using Transposon Mutagenesis

In many microorganisms which are known to produce antipathogenic substances, transposon mutagenesis is a routine technique used for the generation of insertion mutants. This technique has been used successfully in Pseudomonas (e.g. Lam et al., Plasmid 13: 200-204 (1985)), Bacillus (e.g. Youngman et al., Proc. Natl. Acad. Sci. USA 80: 2305-2309 (1983)), Staphylococcus (e.g. Pattee, J. Bacteriol. 145: 479-488 (1981)), and Streptomyces (e.g. Schauer et al., J. Bacteriol. 173: 5060-5067 (1991)), among others. The main requirement for the technique is the ability to introduce a transposon-containing plasmid into the microorganism, enabling the transposon to insert itself at a random position in the genome. A large library of insertion mutants is created by introducing a transposon-carrying plasmid into a large number of microorganisms. Introduction of the plasmid into the microorganism can be by any appropriate standard technique such as conjugation or direct gene transfer techniques such as electroporation.

Once a transposon library has been created in the manner described above, the transposon insertion mutants are assayed for production of the APS. Mutants which do not produce the APS would be expected to predominantly occur as the result of transposon insertion into gene sequences required for APS biosynthesis. These mutants are therefore selected for further analysis.

DNA from the selected mutants which is adjacent to the transposon insert is then cloned using standard techniques. For instance, the host DNA adjacent to the transposon insert may be cloned as part of a library of DNA made from the genomic DNA of the selected mutant. This adjacent host DNA is then identified from the library using the transposon as a DNA probe. Alternatively, if the transposon used contains a suitable gene for antibiotic resistance, then the insertion mutant DNA can be digested with a restriction endonuclease which will be predicted not to cleave within this gene sequence or between its sequence and the host insertion point, followed by cloning of the fragments thus generated into a microorganism such as E. coli which can then be subjected to selection using the chosen antibiotic.

Sequencing of the DNA beyond the inserted transposon reveals the adjacent host sequences. The adjacent sequences can in turn be used as a hybridization probe to reclone the undisrupted native host DNA using a non-mutant host library. The DNA thus isolated from the non-mutant is characterized and used to complement the APS deficient phenotype of the mutant. DNA which complements may contain either APS biosynthetic genes or genes which regulate all or part of the APS biosynthetic pathway. To be sure that the isolated sequences encode biosynthetic genes, they can be transferred to a heterologous host which does not produce the APS and which is insensitive to the APS (such as E. coli). By transferring smaller and smaller pieces of the isolated DNA and sequencing the smallest effective piece, the APS genes can be identified. Alternatively, positively identified clones can be subjected to λ::Tn5 transposon mutagenesis using techniques well known in the art (e.g. de Bruijn & Lupski, Gene 27: 131-149 (1984)). Using this method a number of disruptive transposon insertions are introduced into the DNA shown to confer APS production to enable a delineation of the precise region/s of the DNA which are responsible for APS production. These latter steps are undertaken in a manner analogous to that described in example 2. In order to avoid the possibility of the cloned genes not being expressed in the heterologous host due to the non-functioning of their heterologous promoter, the cloned genes can be transferred to an expression vector where they will be fused to a promoter known to function in the heterologous host. In the case of E. coli an example of a suitable expression vector is pKK223 which utilizes the tac promoter. Similar suitable expression vectors also exist for other hosts such as yeast and are well known in the art. In general such fusions will be easy to undertake because of the operon-type organization of related genes in microorganisms and the likelihood that the biosynthetic enzymes required for APS biosynthesis will be encoded on a single transcript requiring only a single promoter fusion.

Example 4

Cloning Antipathogenic Biosynthetic Genes using Mutagenesis and Complementation

A similar method to that described above involves the use of non-insertion mutagenesis techniques (such as chemical mutagenesis and radiation mutagenesis) together with complementation. The APS producing microorganism is subjected to non-insertion mutagenesis and mutants which lose the ability to produce the APS are selected for further analysis. A gene library is prepared from the parent APS-producing strain. One suitable approach would be the ligation of fragments of 20-30 kb into a vector such as pVK100 (Knauf et al. Plasmid 8: 45-54 (1982)), followed by introduction into E. coli harboring the tra+ plasmid pRK2013, which would enable the transfer by triparental conjugation back to the selected APS-minus mutant (Ditta et al. Proc. Natl. Acad. Sci. USA 77: 7347-7351 (1980)). A further suitable approach would be the transfer of the gene library back to the mutant via electroporation. In each case subsequent selection is for APS production. Selected colonies are further characterized by the retransformation of the APS-minus mutant with smaller fragments of the complementing DNA to identify the smallest successfully complementing fragment, which is then subjected to sequence analysis. As with example 2, genes isolated by this procedure may be biosynthetic genes or genes which regulate the entire or part of the APS biosynthetic pathway. To be sure that the isolated sequences encode biosynthetic genes, they can be transferred to a heterologous host which does not produce the APS and is insensitive to the APS (such as E. coli). These latter steps are undertaken in a manner analogous to that described in example 2.

Example 5

Cloning Antipathogenic Biosynthetic Genes by Exploiting Regulators which Control the Expression of the Biosynthetic Genes

A further approach in the cloning of APS biosynthetic genes relies on the use of regulators which control the expression of these biosynthetic genes. A library of transposon insertion mutants is created in a strain of microorganism which lacks the regulator or has had the regulator gene disabled by conventional gene disruption techniques. The insertion transposon used carries a promoter-less reporter gene (e.g. lacZ). Once the insertion library has been made, a functional copy of the regulator gene is transferred to the library of cells (e.g. by conjugation or electroporation) and the plated cells are selected for expression of the reporter gene. Cells are assayed before and after transfer of the regulator gene. Colonies which express the reporter gene only in the presence of the regulator gene are insertions adjacent to the promoter of genes regulated by the regulator. Assuming the regulator is specific in its regulation for APS-biosynthetic genes, then the genes tagged by this procedure will be APS-biosynthetic genes. These genes can then be cloned and further characterized using the techniques described in example 2.
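The selection logic of this screen (reporter expression only in the presence of the regulator) can be written down compactly. The Python sketch below is illustrative only; the `Colony` record, its field names and the example colony names are hypothetical and simply encode the before/after reporter readings described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colony:
    """Reporter readings for one insertion mutant (hypothetical record)."""
    name: str
    reporter_without_regulator: bool  # e.g. blue on X-gal before transfer
    reporter_with_regulator: bool     # e.g. blue on X-gal after transfer


def regulator_dependent(colonies):
    """Names of colonies expressing the reporter only with the regulator."""
    return [
        c.name
        for c in colonies
        if c.reporter_with_regulator and not c.reporter_without_regulator
    ]


if __name__ == "__main__":
    library = [
        Colony("m1", False, False),  # insertion not behind an active promoter
        Colony("m2", True, True),    # constitutive promoter, not of interest
        Colony("m3", False, True),   # regulator-dependent: candidate
    ]
    print(regulator_dependent(library))
```

Colonies that express the reporter under both conditions are discarded because their promoter activity does not depend on the regulator.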

Example 6

Cloning Antipathogenic Biosynthetic Genes by Homology

Standard DNA techniques can be used for the cloning of novel antipathogenic biosynthetic genes by virtue of their homology to known genes. A DNA library of the microorganism of interest is made and then probed with radiolabelled DNA derived from the gene/s for APS biosynthesis from a different organism. The newly isolated genes are characterized and sequenced and introduced into a heterologous microorganism or a mutant APS-minus strain of the native microorganism to demonstrate their conferral of APS production.

C. Cloning of Pyrrolnitrin Biosynthetic Genes from Pseudomonas

Pyrrolnitrin is a phenylpyrrole compound produced by various strains of Pseudomonas fluorescens. P. fluorescens strains which produce pyrrolnitrin are effective biocontrol strains against Rhizoctonia and Pythium fungal pathogens (WO 94/01561). The biosynthesis of pyrrolnitrin is postulated to start from tryptophan (Chang et al. J. Antibiotics 34: 555-566 (1981)).

Example 7

Use of the gafA Regulator Gene for the Isolation of Pyrrolnitrin Biosynthetic Genes from Pseudomonas

The gene cluster encoding pyrrolnitrin biosynthetic enzymes was isolated using the basic principle described in example 5 above. The regulator gene used in this isolation procedure was the gafA gene from Pseudomonas fluorescens, which is known to be part of a two-component regulatory system controlling certain biocontrol genes in Pseudomonas. The gafA gene is described in detail in pending application Ser. No. 08/087,636, which is hereby incorporated by reference in its entirety, and in the published application WO 94/01561. gafA is further described in Gaffney et al. (1994; MPMI 7(4): 455-463; also hereby incorporated in its entirety by reference) where it is referred to as "ORF5". The gafA gene has been shown to regulate pyrrolnitrin biosynthesis, chitinase, gelatinase and cyanide production. Strains which lack the gafA gene or which express the gene at low levels (and in consequence gafA-regulated genes also at low levels) are suitable for use in this isolation technique.

Example 8

Isolation of Pyrrolnitrin Biosynthesis Genes in Pseudomonas

The transfer of the gafA gene from MOCG 134 to closely related non-pyrrolnitrin producing wild-type strains of Pseudomonas fluorescens results in the ability of these strains to produce pyrrolnitrin (Gaffney et al., MPMI (1994); see also Hill et al., Applied and Environmental Microbiology 60: 78-85 (1994)). This indicates that these closely related strains have the structural genes needed for pyrrolnitrin biosynthesis but are unable to produce the compound without activation from the gafA gene. One such closely related strain, MOCG133, was used for the identification of the pyrrolnitrin biosynthesis genes. The transposon TnCIB116 (Lam, New Directions in Biological Control: Alternatives for Suppressing Agricultural Pests and Diseases, pp 767-778, Alan R. Liss, Inc. (1990)) was used to mutagenize MOCG133. This transposon, a Tn5 derivative, encodes kanamycin resistance and contains a promoterless lacZ reporter gene near one end. The transposon was introduced into MOCG133 by conjugation, using the plasmid vector pCIB116 (Lam, New Directions in Biological Control: Alternatives for Suppressing Agricultural Pests and Diseases, pp 767-778, Alan R. Liss, Inc. (1990)) which can be mobilized into MOCG133, but cannot replicate in that organism. Most, if not all, of the kanamycin resistant transconjugants were therefore the result of transposition of TnCIB116 into different sites in the MOCG133 genome. When the transposon integrates into the bacterial chromosome behind an active promoter the lacZ reporter gene is activated. Such gene activation can be monitored visually by using the substrate X-gal, which releases an insoluble blue product upon cleavage by the lacZ gene product. Kanamycin resistant transconjugants were collected and arrayed on master plates which were then replica plated onto lawns of E. coli strain S17-1 (Simon et al., Bio/Technology 1: 784-791 (1983)) transformed with a plasmid carrying the wide host range RK2 origin of replication, a gene for tetracycline selection and the gafA gene. E. coli strain S17-1 contains chromosomally integrated tra genes for conjugal transfer of plasmids. Thus, replica plating of insertion transposon mutants onto a lawn of the S17-1/gafA E. coli results in the transfer to the insertion transposon mutants of the gafA-carrying plasmid and enables the activity of the lacZ gene to be assayed in the presence of the gafA regulator (expression of the host gafA is insufficient to cause lacZ expression, and introduction of gafA on a multicopy plasmid is more effective). Insertion mutants which had a "blue" phenotype (i.e. lacZ activity) only in the presence of gafA were identified. In these mutants, the transposon had integrated within genes whose expression was regulated by gafA. These mutants (with introduced gafA) were assayed for their ability to produce cyanide, chitinase, and pyrrolnitrin (as described in Gaffney et al., 1994 MPMI), activities known to be regulated by gafA (Gaffney et al., 1994 MPMI). One mutant did not produce pyrrolnitrin but did produce cyanide and chitinase, indicating that the transposon had inserted in a genetic region involved only in pyrrolnitrin biosynthesis. DNA sequences flanking one end of the transposon were cloned by digesting chromosomal DNA isolated from the selected insertion mutant with XhoI, ligating the fragments derived from this digestion into the XhoI site of pSP72 (Promega, cat. #P2191) and selecting the E. coli transformed with the products of this ligation on kanamycin. The unique XhoI site within the transposon cleaves beyond the gene for kanamycin resistance and enabled the flanking region derived from the parent MOCG 133 strain to be concurrently isolated on the same XhoI fragment.

In fact the XhoI site of the flanking sequence was found to be located approximately 1 kb away from the end of the transposon. A subfragment of the cloned XhoI fragment derived exclusively from the ˜1 kb flanking sequence was then used to isolate the native (i.e. non-disrupted) gene region from a cosmid library of strain MOCG 134. The cosmid library was made from partially Sau3A digested MOCG 134 DNA, size selected for fragments of between 30 and 40 kb and cloned into the unique BamHI site of the cosmid vector pCIB119 which is a derivative of c2XB (Bates & Swift, Gene 26: 137-146 (1983)) and pRK290 (Ditta et al. Proc. Natl. Acad. Sci. USA 77: 7347-7351 (1980)). pCIB119 is a double-cos site cosmid vector which has the wide host range RK2 origin of replication and can therefore replicate in Pseudomonas as well as E. coli. Several clones were isolated from the MOCG 134 cosmid clone library using the ˜1 kb flanking sequence as a hybridization probe. Of these, one clone was found to restore pyrrolnitrin production to the transposon insertion mutant which had lost its ability to produce pyrrolnitrin. This clone had an insertion of ˜32 kb and was designated pCIB169. E. coli DH5α containing pCIB169 was deposited at the Agricultural Research Culture Collection (NRRL), 1815 N. University Street, Peoria, Ill. 61604 on May 20, 1994, and assigned accession number NRRL B-21256.
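As a rough guide to how many cosmid clones such a library needs, the widely used Clarke and Carbon estimate N = ln(1 - P) / ln(1 - f) can be applied, where f is the insert size divided by the genome size and P is the desired probability that any given sequence is represented. The Python sketch below is a minimal illustration; the genome size of 6,000 kb is an assumed round figure for illustration only and is not stated in this specification, and the function name is hypothetical.

```python
import math


def clones_needed(insert_kb, genome_kb, probability):
    """Clarke-Carbon estimate of library size.

    insert_kb:   average insert size (kb)
    genome_kb:   genome size (kb); an assumed value in the example below
    probability: desired chance that a given sequence is in the library
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    if not 0 < insert_kb < genome_kb:
        raise ValueError("insert must be smaller than the genome")
    fraction = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


if __name__ == "__main__":
    # 35 kb is the midpoint of the 30-40 kb size selection described above;
    # 6000 kb is an assumed genome size, used only for illustration.
    print(clones_needed(insert_kb=35, genome_kb=6000, probability=0.99))
```

Under these assumptions a library of somewhat fewer than a thousand clones would be expected to suffice, which is why large-insert cosmid vectors are convenient for recovering whole biosynthetic clusters.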

Example 9

Mapping and Tn5 Mutagenesis of pCIB169

The 32 kb insert of clone pCIB169 was subcloned in E. coli HB101 into pCIB189, a derivative of pBR322 which contains a unique NotI cloning site. A convenient NotI site within the 32 kb insert as well as the presence of NotI sites flanking the BamHI cloning site of the parent cosmid vector pCIB119 allowed the subcloning of fragments of 14 and 18 kb into pCIB189. These clones were both mapped by restriction digestion and FIG. 1 shows the result of this. λTn5 transposon mutagenesis was carried out on both the 14 and 18 kb subclones using techniques well known in the art (e.g. de Bruijn & Lupski, Gene 27: 131-149 (1984)). λTn5 phage conferring kanamycin resistance was used to transfect both the 14 and the 18 kb subclones described above. λTn5 transfections were done at a multiplicity of infection of 0.1 with subsequent selection on kanamycin. Following mutagenesis, plasmid DNA was prepared and retransformed into E. coli HB101 with kanamycin selection to enable the isolation of plasmid clones carrying Tn5 insertions. A total of 30 independent Tn5 insertions were mapped along the length of the 32 kb insert (see FIG. 2). Each of these insertions was crossed into MOCG 134 via double homologous recombination and verified by Southern hybridization using the Tn5 sequence and the pCIB189 vector as hybridization probes to demonstrate the occurrence of double homologous recombination, i.e. the replacement of the wild-type MOCG 134 gene with the Tn5-insertion gene. Pyrrolnitrin assays were performed on each of the insertions that were crossed into MOCG 134 and a genetic region of approximately 6 kb was identified to be involved in pyrrolnitrin production (see FIGS. 3 and 5). This region was found to be centrally located in pCIB169 and was easily subcloned as an XbaI/NotI fragment into pBluescript II KS (Promega). The XbaI/NotI subclone was designated pPRN5.9X/N (see FIG. 4).
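The way mapped insertions delimit a functional region can be illustrated with a short Python sketch. The insertion positions and phenotypes in the example are invented for illustration and are not the data of FIG. 2 or FIG. 3; the function name is likewise hypothetical. The region must at least span the outermost insertions that abolish production, and it cannot extend past the nearest flanking insertions that leave production intact.

```python
def delineate_region(insertions):
    """Bound a functional region from insertion-mutant phenotypes.

    insertions: iterable of (position_kb, produces_aps) tuples.
    Returns (inner, outer), where inner spans the outermost null insertions
    and outer gives the nearest flanking insertions that still produce
    (None where no such insertion exists). Returns None if no insertion
    abolishes production.
    """
    ordered = sorted(insertions)
    nulls = [pos for pos, produces in ordered if not produces]
    if not nulls:
        return None
    inner = (min(nulls), max(nulls))
    left = max((p for p, ok in ordered if ok and p < inner[0]), default=None)
    right = min((p for p, ok in ordered if ok and p > inner[1]), default=None)
    return inner, (left, right)


if __name__ == "__main__":
    # Hypothetical map positions (kb) and pyrrolnitrin phenotypes.
    hypothetical = [
        (2.0, True), (10.0, True), (12.5, False), (15.0, False),
        (17.5, False), (20.0, True), (30.0, True),
    ]
    print(delineate_region(hypothetical))
```

The resolution of such a map is limited by the spacing of the insertions, which is why a reasonably dense set of independent insertions is collected before subcloning.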

Example 10

Identification of Open Reading Frames in the Cloned Genetic Region

The genetic region involved in pyrrolnitrin production was subcloned into six fragments for sequencing in the vector pBluescript II KS (see FIG. 4). These fragments spanned the ˜6 kb XbaI/NotI fragment described above and extended from the EcoRI site on the left side of FIG. 4 to the rightmost HindIII site (see FIG. 4). The sequence of the inserts of clones pPRN1.77E, pPRN1.01E, pPRN1.24E, pPRN2.18E, pPRN0.8H/N, and pPRN2.7H was determined using the Taq DyeDeoxy Terminator Cycle Sequencing Kit supplied by Applied Biosystems, Inc., Foster City, Calif., following the protocol supplied by the manufacturer. Sequencing reactions were run on an Applied Biosystems 373A Automated DNA Sequencer and the raw DNA sequence was assembled and edited using the "INHERIT" software package also from Applied Biosystems, Inc. A contiguous DNA sequence of 9.7 kb was obtained corresponding to the EcoRI/HindIII fragment of FIG. 3 and bounded by EcoRI site #2 and HindIII site #2 depicted in FIG. 4.

DNA sequence analysis was performed on the contiguous 9.7 kb sequence using the GCG software package from Genetics Computer Group, Inc., Madison, Wis. The pattern recognition program "FRAMES" was used to search for open reading frames (ORFs) in all six translation frames of the DNA sequence. Four open reading frames were identified using this program and the codon frequency table from ORF2 of the gafA gene region which was previously published (WO 94/05793; FIG. 5). These ORFs lie entirely within the ˜6 kb XbaI/NotI fragment referred to in example 9 (FIG. 4) and are contained within the sequence disclosed as SEQ ID NO:1. Comparison of the codon frequency usage table from the MOCG134 DNA sequence of the gafA region with these four open reading frames showed that very few rare codons were used, indicating that codon usage was similar in both of these gene regions. This strongly suggested that the four open reading frames were real. At a 3' position to the fourth reading frame numerous ρ-independent stem loop structures were found, suggesting a region where transcription could be stopped. It was thus apparent that all four ORFs were translated from a single transcript. Sequence data obtained for the regions beyond the four identified ORFs revealed a fifth open reading frame which was subsequently determined to not be involved in pyrrolnitrin synthesis based on E. coli expression studies.
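The idea of scanning all six translation frames for open reading frames can be illustrated with a short Python sketch. This is a deliberately simplified stand-in and not the GCG "FRAMES" program: it only looks for ATG-to-stop stretches of a minimum length and does not use a codon frequency table. The function names, the default minimum length and the toy sequence are assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_on_strand(seq, min_codons):
    """ATG-to-stop ORFs in the three frames of one strand.

    Returns (start, end, frame) tuples; coordinates are 0-based on the
    strand scanned, and end is exclusive and includes the stop codon.
    """
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    found.append((start, i + 3, frame))
                start = None
    return found


def six_frame_orfs(seq, min_codons=100):
    """ORFs on both strands as (strand, start, end, frame) tuples."""
    seq = seq.upper()
    plus = [("+", *orf) for orf in orfs_on_strand(seq, min_codons)]
    minus = [("-", *orf)
             for orf in orfs_on_strand(reverse_complement(seq), min_codons)]
    return plus + minus


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"  # toy sequence, one short ORF
    print(six_frame_orfs(toy, min_codons=3))
```

A codon-usage comparison of the kind described above would be a further filter applied to the candidates returned by such a scan.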

Example 11

Expression of Pyrrolnitrin Biosynthetic Genes in E. coli

To determine if only four genes were needed for pyrrolnitrin production, these genes were transferred into E. coli which was then assayed for pyrrolnitrin production. The expression vector pKK223-3 was used to over-express the cloned operon in E. coli (Brosius & Holy, Proc. Natl. Acad. Sci. USA 81: 6929 (1984)). pKK223-3 contains a strong tac promoter which, in the appropriate host, is regulated by the lac repressor and induced by the addition of isopropyl-β-D-thiogalactoside (IPTG) to the bacterial growth medium. This vector was modified by the addition of further useful restriction sites to the existing multiple cloning site to facilitate the cloning of the ˜6 kb XbaI/NotI fragment (see example 9 and FIG. 4) and a 10 kb XbaI/KpnI fragment (see FIG. 4) for expression studies. In each case the cloned fragment was under the control of the E. coli tac promoter (with IPTG induction), but was cloned in a transcriptional fusion so that the ribosome binding site used would be that derived from Pseudomonas. Each of these clones was transformed into E. coli XL1-blue host cells and induced with 2.5 mM IPTG before being assayed for pyrrolnitrin by thin layer chromatography. Cultures were grown for 24 h after IPTG induction in 10 ml L broth at 37° C. with rapid shaking, then extracted with an equal volume of ethyl acetate. The organic phase was recovered, allowed to evaporate under vacuum and the residue dissolved in 20 μl of methanol. Silica gel thin layer chromatography (TLC) plates were spotted with 10 μl of extract and run with toluene as the mobile phase. The plates were allowed to dry and sprayed with van Urk's reagent to visualize. Van Urk's reagent comprises 1 g p-dimethylaminobenzaldehyde in 50 ml 36% HCl and 50 ml 95% ethanol. Under these conditions pyrrolnitrin appears as a purple spot on the TLC plate. This assay confirmed the presence of pyrrolnitrin in both of the expression constructs. HPLC and mass spectrometry analysis further confirmed the presence of pyrrolnitrin in both of the extracts. HPLC analysis can be undertaken directly after redissolving in methanol (in this case the sample is redissolved in 55% methanol) using a Hewlett Packard Hypersil ODS column (5 μm) of dimensions 100×2.1 mm. Pyrrolnitrin elutes after about 14 min.

Example 12

Construction of Pyrrolnitrin Gene Deletion Mutants

To further demonstrate the involvement of the 4 ORFs in pyrrolnitrin biosynthesis, independent deletions were created in each ORF and transferred back into Pseudomonas fluorescens strain MOCG134 by homologous recombination. The plasmids used to generate deletions are depicted in FIG. 4 and the positions of the deletions are shown in FIG. 6. Each ORF is identified within the sequence disclosed as SEQ ID NO:1.

ORF1 (SEQ ID NO:2)

The plasmid pPRN1.77E was digested with MluI to liberate a 78 bp fragment internally from ORF1. The remaining 4.66 kb vector-containing fragment was recovered, religated with T4 DNA ligase, and transformed into the E. coli host strain DH5α. This new plasmid was linearized with MluI and the Klenow large fragment of DNA polymerase I was used to create blunt ends (Maniatis et al. Molecular Cloning, Cold Spring Harbor Laboratory (1982)). The neomycin phosphotransferase II (NPTII) gene cassette from pUC4K (Pharmacia) was ligated into the plasmid by blunt end ligation and the new construct, designated pBS(ORF1Δ), was transformed into DH5α. The construct contained a 78 bp deletion of ORF1 at which position the NPTII gene conferring kanamycin resistance had been inserted. The insert of this plasmid (i.e. ORF1 with NPTII insertion) was then excised from the pBluescript II KS vector with EcoRI, ligated into the EcoRI site of the vector pBR322 and transformed into the E. coli host strain HB101. The new plasmid was verified by restriction enzyme digestion and designated pBR322(ORF1Δ).

ORF2 (SEQ ID NO:3)

The plasmids pPRN1.24E and pPRN1.01E containing contiguous EcoRI fragments spanning ORF2 were double digested with EcoRI and XhoI. The 1.09 kb fragment from pPRN1.24E and the 0.69 kb fragment from pPRN1.01E were recovered and ligated together into the EcoRI site of pBR322. The resulting plasmid was transformed into the host strain DH5α and the construct was verified by restriction enzyme digestion and electrophoresis. The plasmid was then linearized with XhoI, the NPTII gene cassette from pUC4K was inserted, and the new construct, designated pBR(ORF2Δ), was transformed into HB101. The construct was verified by restriction digestion and agarose gel electrophoresis and contains NPTII within a 472 bp deletion of the ORF2 gene.

ORF3 (SEQ ID NO:4)

The plasmid pPRN2.56Sph was digested with PstI to liberate a 350 bp fragment. The remaining 2.22 kb vector-containing fragment was recovered and the NPTII gene cassette from pUC4K was ligated into the PstI site. This intermediate plasmid, designated pUC(ORF3Δ), was transformed into DH5α and verified by restriction digestion and agarose gel electrophoresis. The gene deletion construct was excised from pUC with SphI and ligated into the SphI site of pBR322. The new plasmid, designated pBR(ORF3Δ), was verified by restriction enzyme digestion and agarose gel electrophoresis. This plasmid contains the NPTII gene within a 350 bp deletion of the ORF3 gene.

ORF4 (SEQ ID NO:5)

The plasmid pPRN2.18E/N was digested with AatII to liberate a 156 bp fragment. The remaining 2.0 kb vector-containing fragment was recovered, religated, transformed into DH5α, and verified by restriction enzyme digestion and electrophoresis. The new plasmid was linearized with AatII and T4 DNA polymerase was used to create blunt ends. The NPTII gene cassette was ligated into the plasmid by blunt-end ligation and the new construct, designated pBS(ORF4Δ), was transformed into DH5α. The insert was excised from the pBluescript II KS vector with EcoRI, ligated into the EcoRI site of the vector pBR322 and transformed into the E. coli host strain HB101. The identity of the new plasmid, designated pBR(ORF4Δ), was verified by restriction enzyme digestion and agarose gel electrophoresis. This plasmid contains the NPTII gene within a 264 bp deletion of the ORF4 gene.

Km^(R) Control

To control for possible effects of the kanamycin resistance marker, the NPTII gene cassette from pUC4K was inserted upstream of the pyrrolnitrin gene region. The plasmid pPRN2.5 S (a subclone of pPRN7.2E) was linearized with PstI and the NPTII cassette was ligated into the PstI site. This intermediate plasmid was transformed into DH5α and verified by restriction digestion and agarose gel electrophoresis. The gene insertion construct was excised from pUC with SphI and ligated into the SphI site of pBR322. The new plasmid, designated pBR(2.SSphIKm^(R)), was verified by restriction enzyme digestion and agarose gel electrophoresis. It contains the NPTII region inserted upstream of the pyrrolnitrin gene region.

Each of the gene deletion constructs was mobilized into MOCG134 by triparental mating using the helper plasmid pRK2013 in E. coli HB101. Gene replacement mutants were selected by plating on Pseudomonas Minimal Medium (PMM) supplemented with 50 mg/ml kanamycin and counterselected on PMM supplemented with 30 mg/ml tetracycline. Putative perfect replacement mutants were verified by Southern hybridization by probing EcoRI digested DNA with pPRN18Not, pBR322 and an NPTII cassette obtained from pUC4K (Pharmacia 1994 catalog no. 27-4958-01). Perfect replacement was confirmed by lack of hybridization to pBR322, hybridization of pPRN18Not to an appropriately size-shifted EcoRI fragment (reflecting deletion and insertion of NPTII), hybridization of the NPTII probe to the shifted band, and the disappearance of a band corresponding to a deleted fragment.
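The four Southern hybridization criteria above can be expressed as a simple checklist. The following is a minimal sketch only; the class name, field names, and the cassette size used in the usage note are illustrative assumptions and are not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass
class SouthernResult:
    """Observed outcome for one putative replacement mutant (hypothetical record)."""
    hybridizes_to_pbr322: bool          # vector sequences still present?
    prn_probe_band_shifted: bool        # pPRN18Not probe hits a size-shifted EcoRI band
    nptii_probe_on_shifted_band: bool   # NPTII probe hits the same shifted band
    deleted_fragment_band_present: bool # band for the deleted fragment still visible?


def is_perfect_replacement(r: SouthernResult) -> bool:
    """Apply the four criteria listed in the text; all must hold."""
    return (
        not r.hybridizes_to_pbr322
        and r.prn_probe_band_shifted
        and r.nptii_probe_on_shifted_band
        and not r.deleted_fragment_band_present
    )


def expected_shift_bp(deletion_bp: int, cassette_bp: int) -> int:
    """Net change in EcoRI fragment size: cassette inserted minus sequence deleted.

    cassette_bp is an input the user must supply; it is not stated in the text.
    """
    return cassette_bp - deletion_bp
```

For example, with the 78 bp ORF1 deletion and a placeholder cassette size of 1300 bp (an assumed value for illustration), `expected_shift_bp(78, 1300)` gives a net increase of 1222 bp in the hybridizing EcoRI fragment.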

After verification, deletion mutants were tested for production of pyrrolnitrin, 2-hexyl-5-propyl-resorcinol, cyanide, and chitinase. A deletion in any one of the ORFs abolished pyrrolnitrin production, but did not affect production of the other substances. The presence of the NPTII gene cassette in the Km^(R) control had no effect on the production of pyrrolnitrin, 2-hexyl-5-propyl-resorcinol, cyanide or chitinase. These experiments demonstrated the requirement of each of the four ORFs for pyrrolnitrin production.

D. Cloning of Resorcinol Biosynthetic Genes from Pseudomonas

2-hexyl-5-propyl-resorcinol is a further APS produced by certain strains of Pseudomonas. It has been shown to have antipathogenic activity against Gram-positive bacteria (in particular Clavibacter spp.), mycobacteria, and fungi.

Example 13

Isolation of Genes Encoding Resorcinol

Two transposon-insertion mutants have been isolated which lack the ability to produce the antipathogenic substance 2-hexyl-5-propyl-resorcinol, which is a further substance known to be under the global regulation of the gafA gene in Pseudomonas fluorescens (WO 94/01561). The insertion transposon TnCIB116 was used to generate libraries of mutants in MOCG134 and a gafA⁻ derivative of MOCG134 (BL1826). The former was screened for changes in fungal inhibition in vitro; the latter was screened for genes regulated by gafA after introduction of gafA on a plasmid (see Section C). Selected mutants were characterized by HPLC to assay for production of known compounds such as pyrrolnitrin and 2-hexyl-5-propyl-resorcinol. The HPLC assay enabled a comparison of the novel mutants to the wild-type parental strain. In each case, the HPLC peak corresponding to 2-hexyl-5-propyl-resorcinol was missing in the mutant. The mutant derived from MOCG134 is designated BL1846. The mutant derived from BL1826 is designated BL1911. HPLC for resorcinol follows the same procedure as for pyrrolnitrin (see example 11) except that 100% methanol is applied to the column at 20 min to elute resorcinol.

The resorcinol biosynthetic genes can be cloned from the above-identified mutants in the following manner. Genomic DNA is prepared from the mutants, and clones containing the transposon insertion and adjacent Pseudomonas sequence are obtained by selecting for kanamycin resistant clones (kanamycin resistance is encoded by the transposon). The cloned Pseudomonas sequence is then used as a probe to identify the native sequences from a genomic library of P. fluorescens MOCG134. The cloned native genes are likely to represent resorcinol biosynthetic genes.

E. Cloning Soraphen Biosynthetic Genes from Sorangium

Soraphen is a polyketide antibiotic produced by the myxobacterium Sorangium cellulosum. This compound has broad antifungal activities which make it useful for agricultural applications. In particular, soraphen has activity against a broad range of foliar pathogens.

Example 14

Isolation of the Soraphen Gene Cluster

Genomic DNA was isolated from Sorangium cellulosum and partially digested with Sau3A. Fragments of between 30 and 40 kb were size selected and cloned into the cosmid vector pHC79 (Hohn & Collins, Gene 11: 291-298 (1980)) which had been previously digested with BamHI and treated with alkaline phosphatase to prevent self ligation. The cosmid library thus prepared was probed with a 4.6 kb fragment which contains the graI region of Streptomyces violaceoruber strain Tu22 encoding ORFs 1-4 responsible for the biosynthesis of granaticin in S. violaceoruber. Cosmid clones which hybridized to the graI probe were identified and DNA was prepared for analysis by restriction digestion and further hybridization. Cosmid p98/1 was identified to contain a 1.8 kb SalI fragment which hybridized strongly to the graI region; this SalI fragment was located within a larger 6.5 kb PvuI fragment within the ˜40 kb insert of p98/1. Determination of the sequence of part of the 1.8 kb SalI insert revealed homology to the acetyltransferase proteins required for the synthesis of erythromycin. Restriction mapping of the cosmid p98/1 was undertaken and generated the map depicted in FIG. 7. The DNA sequence of the soraphen gene cluster is disclosed in SEQ ID NO:6. E. coli HB101 containing p98/1 was deposited at the Agricultural Research Culture Collection (NRRL), 1815 N. University Street, Peoria, Ill. 61604 on May 20, 1994, and assigned accession number NRRL B-21255.

Example 15

Functional Analysis of the Soraphen Gene Cluster

The regions within p98/1 that encode proteins with a role in the biosynthesis of soraphen were identified through gene disruption experiments. Initially, DNA fragments were derived from cosmid p98/1 by restriction with PvuI and cloned into the unique PvuI cloning site (which is within the gene for ampicillin resistance) of the wide host-range plasmid pSUP2021 (Simon et al. in: Molecular Genetics of the Bacteria-Plant Interaction (ed.: A Puhler), Springer Verlag, Berlin pp. 98-106 (1983)). Transformed E. coli HB101 was selected for resistance to chloramphenicol, but sensitivity to ampicillin. Selected colonies carrying appropriate inserts were transferred to Sorangium cellulosum SJ3 by conjugation using the method described in the published application EP 0 501 921 and EP the later app. (both to Ciba-Geigy). Plasmids were transferred to E. coli ED8767 carrying the helper plasmid pUZ8 (Hedges & Mathew, Plasmid 2: 269-278 (1979)) and the donor cells were incubated with Sorangium cellulosum SJ3 cells from a stationary phase culture for conjugative transfer essentially as described in EP 0 501 921 (example 5) and EP the later app. (example 2). Selection was on kanamycin, phleomycin and streptomycin. It has been determined that no plasmids tested thus far are capable of autonomous replication in Sorangium cellulosum; rather, integration of the entire plasmid into the chromosome by homologous recombination occurs at a site within the cloned fragment at low frequency. These events can be selected for by the presence of antibiotic resistance markers on the plasmid. Integration of the plasmid at a given site results in the insertion of the plasmid into the chromosome and the concomitant disruption of this region. Therefore, a given phenotype of interest, i.e. soraphen production, can be assessed, and disruption of the phenotype will indicate that the DNA region cloned into the plasmid must have a role in the determination of this phenotype.

Recombinant pSUP2021 clones with PvuI inserts of approximate size 6.5 kb (pSN105/7), 10 kb (pSN120/10), 3.8 kb (pSN120/43-39) and 4.0 kb (pSN120/46) were selected. The map locations (in kb) of these PvuI inserts as shown in FIG. 7 are: pSN105/7--25.0-31.7, pSN120/10--2.5-14.5, pSN120/43-39--16.1-20.0, and pSN120/46--20.0-24.0. pSN105/7 was shown by digestion with PvuI and SalI to contain the 1.8 kb fragment referred to above in example 11. Gene disruptions with the 3.8, 4.0, 6.5, and 10 kb PvuI fragments all resulted in the elimination of soraphen production. These results indicate that all of these fragments contain genes or fragments of genes with a role in the production of this compound.
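As a bookkeeping aid, the map coordinates given above can be converted to insert spans and compared with the stated approximate sizes. This is a sketch only; the function names and the 0.5 kb tolerance are assumptions chosen for illustration, and the stated sizes are described in the text as approximate.

```python
# Clone name -> (map start in kb, map end in kb, stated approximate size in kb),
# all values taken from the paragraph above.
PVUI_INSERTS = {
    "pSN105/7": (25.0, 31.7, 6.5),
    "pSN120/10": (2.5, 14.5, 10.0),
    "pSN120/43-39": (16.1, 20.0, 3.8),
    "pSN120/46": (20.0, 24.0, 4.0),
}


def map_span(start_kb: float, end_kb: float) -> float:
    """Length of an insert implied by its map coordinates, rounded to 0.1 kb."""
    return round(end_kb - start_kb, 1)


def spans_outside_tolerance(table, tolerance_kb: float = 0.5):
    """Return clone names whose map span differs from the stated size by more
    than tolerance_kb (an arbitrary threshold assumed for this sketch)."""
    flagged = []
    for name, (start, end, stated) in table.items():
        if abs(map_span(start, end) - stated) > tolerance_kb:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    for name, (start, end, stated) in PVUI_INSERTS.items():
        print(f"{name}: map span {map_span(start, end)} kb, stated ~{stated} kb")
    print("Outside tolerance:", spans_outside_tolerance(PVUI_INSERTS))
```

A check of this kind only compares the coordinates with the approximate sizes as written; it does not indicate which of the two values reflects the actual insert.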

Subsequently gene disruption experiments were performed with two BglII fragments derived from cosmid p98/1. These were of size 3.2 kb (map location 32.4-35.6 on FIG. 7) and 2.9 kb (map location 35.6-38.5 on FIG. 7). These fragments were cloned into the BamHI site of plasmid pCIB132 that was derived from pSUP2021 according to FIG. 8. The ˜5 kb NotI fragment of pSUP2021 was excised and inverted, followed by the removal of the ˜3 kb BamHI fragment. Neither of these BglII fragments was able to disrupt soraphen biosynthesis when reintroduced into Sorangium using the method described above. This indicates that the DNA of these fragments has no role in soraphen biosynthesis. Examination of the DNA sequence indicates the presence of a thioesterase domain 5' to, but near, the BglII site at location 32.4. In addition, there are translation stop codons immediately after the thioesterase domain which are likely to demarcate the end of the ORF1 coding region. As the 2.9 and 3.2 kb BglII fragments are immediately to the right of these sequences it is likely that there are no other genes downstream from ORF1 that are involved in soraphen biosynthesis.

Delineation of the left end of the biosynthetic region required the isolation of two other cosmid clones, pJL1 and pJL3, that overlap p98/1 on the left end, but include more DNA leftwards of p98/1. These were isolated by hybridization with the 1.3 kb BamHI fragment on the extreme left end of p98/1 (map location 0.0-1.3) to the Sorangium cellulosum gene library. It should be noted that the BamHI site at 0.0 does not exist in the S. cellulosum chromosome but was formed as an artifact from the ligation of a Sau3A restriction fragment derived from the Sorangium cellulosum genome into the BamHI cloning site of pHC79. Southern hybridization with the 1.3 kb BamHI fragment demonstrated that pJL1 and pJL3 each contain an approximately 12.5 kb BamHI fragment that contains sequences common to the 1.3 kb fragment, as this fragment is in fact delineated by the BamHI site at position 1.3. Gene disruption experiments using the 12.5 kb BamHI fragment indicated that this fragment contains sequences that are involved in the synthesis of soraphen. Gene disruption using smaller EcoRV fragments derived from this region also indicated the requirement of this region for soraphen biosynthesis. For example, two EcoRV fragments of 3.4 and 1.1 kb located adjacent to the distal BamHI site at the left end of the 12.5 kb fragment resulted in a reduction in soraphen biosynthesis when used in gene disruption experiments. E. coli HB101 containing pJL3 was deposited at the Agricultural Research Culture Collection (NRRL), 1815 N. University Street, Peoria, Ill. 61604 on May 20, 1994, and assigned accession number NRRL B-21254.

Example 16

Sequence Analysis of the Soraphen Gene Cluster

The DNA sequence of the soraphen gene cluster was determined from the PvuI site at position 2.5 to the BglII site at position 32.4 (see FIG. 7) using the Taq DyeDeoxy Terminator Cycle Sequencing Kit supplied by Applied Biosystems, Inc., Foster City, Calif. following the protocol supplied by the manufacturer. Sequencing reactions were run on an Applied Biosystems 373A Automated DNA Sequencer and the raw DNA sequence was assembled and edited using the "INHERIT" software package also from Applied Biosystems, Inc. The pattern recognition program "FRAMES" was used to search for open reading frames (ORFs) in all six translation frames of the DNA sequence. In total approximately 30 kb of contiguous DNA was assembled and this corresponds to the region determined to be critical to soraphen biosynthesis in the disruption experiments described in example 12. This sequence encodes two ORFs which have the structure described below.
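The idea of searching all six translation frames for ORFs can be illustrated with a short sketch. This is not the "FRAMES" program, whose algorithm is not described here; it is a simplified scan that assumes ATG as the only start codon and the three standard stop codons, and all function names are illustrative.

```python
# Minimal six-frame ORF scan (illustrative; not the FRAMES program).
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq: str, frame: int, min_codons: int):
    """Yield (start, end) for each ATG...stop stretch in one reading frame.

    Coordinates are 0-based, end-exclusive, relative to the strand scanned;
    the stop codon is included in the reported interval.
    """
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            if (i + 3 - start) // 3 >= min_codons:
                yield (start, i + 3)
            start = None


def six_frame_orfs(seq: str, min_codons: int = 2):
    """Scan three frames on each strand; minus-strand coordinates refer to
    positions on the reverse-complemented sequence."""
    seq = seq.upper()
    results = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(strand_seq, frame, min_codons):
                results.append((strand, frame, start, end))
    return results
```

On real sequence data a much larger `min_codons` threshold would normally be used, and organisms with alternative start codons would need the start-codon set extended.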

ORF1

ORF1 is approximately 25.5 kb in size and encodes five biosynthetic modules with homology to the modules found in the erythromycin biosynthetic genes of Saccharopolyspora erythraea (Donadio et al. Science 252: 675-679 (1991)). Each module contains a β-ketoacylsynthase (KS), an acyltransferase (AT), a ketoreductase (KR) and an acyl carrier protein (ACP) domain as well as β-ketone processing domains which may include a dehydratase (DH) and/or enoyl reductase (ER) domain. In the biosynthesis of the polyketide structure each module directs the incorporation of a new two carbon extender unit and the correct processing of the β-ketone carbon.

ORF2

In addition to ORF1, DNA sequence data from the p98/1 fragment spanning the PvuI site at 2.5 kb and the SmaI site at 6.2 kb indicated the presence of a further ORF (ORF2) immediately adjacent to ORF1. The DNA sequence demonstrates the presence of a typical biosynthetic module that appears to be encoded on an ORF whose 5' end is not yet sequenced and is some distance to the left. By comparison to other polyketide biosynthetic gene units and the number of carbon atoms in the soraphen ring structure it is likely that there should be a total of eight modules in order to direct the synthesis of the 17-carbon molecule soraphen. Since there are five modules in ORF1 described above, it was predicted that ORF2 contains a further three and that these would extend beyond the left end of cosmid p98/1 (position 0 in FIG. 7). This is entirely consistent with the gene description of example 12. The cosmid clones pJL1 and pJL3 extending beyond the left end of p98/1 presumably carry the sequence encoding the remaining modules required for soraphen biosynthesis.

Example 17

Soraphen: Requirement for Methylation

Synthesis of polyketides typically requires, as a first step, the condensation of a starter unit (commonly acetate) and an extender unit (malonate) with the loss of one carbon atom in the form of CO₂ to yield a three-carbon chain. All subsequent additions result in the addition of two carbon units to the polyketide ring (Donadio et al. Science 252: 675-679 (1991)). Since soraphen has a 17-carbon ring, it is likely that there are 8 biosynthetic modules required for its synthesis. Five modules are encoded in ORF1 and a sixth is present at the 3' end of ORF2. As explained above, it is likely that the remaining two modules are also encoded by ORF2 in the regions that are in the 15 kb BamHI fragment from pJL1 and pJL3 for which the sequence has not yet been determined.
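The module count follows from the carbon arithmetic stated above: the first condensation yields a three-carbon chain and each later step adds two carbons, so a 17-carbon ring implies 1 + (17 - 3)/2 = 8 modules. A minimal sketch of that calculation follows; the function name and parameter names are illustrative assumptions.

```python
def modules_for_ring(ring_carbons: int,
                     first_step_carbons: int = 3,
                     carbons_per_extension: int = 2) -> int:
    """Number of condensation steps (modules) implied by the ring size.

    The first step gives a chain of first_step_carbons; each further step
    adds carbons_per_extension, as described in the text.
    """
    remaining = ring_carbons - first_step_carbons
    if remaining < 0 or remaining % carbons_per_extension != 0:
        raise ValueError("ring size is not reachable with these step sizes")
    return 1 + remaining // carbons_per_extension


total = modules_for_ring(17)           # 1 + (17 - 3) // 2
in_orf1 = 5                            # modules encoded in ORF1 (from the text)
sequenced_in_orf2 = 1                  # sixth module at the 3' end of ORF2
not_yet_sequenced = total - in_orf1 - sequenced_in_orf2
print(total, not_yet_sequenced)
```

Under these assumptions the calculation reproduces the figures in the text: eight modules in total, of which two remain to be sequenced.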

The polyketide modular biosynthetic apparatus present in Sorangium cellulosum is required for the production of the compound, soraphen C, which has no antipathogenic activity. The structure of this compound is the same as that of the antipathogenic soraphen A with the exception that the O-methyl groups of soraphen A at positions 6, 7, and 14 of the ring are hydroxyl groups. These are methylated by a specific methyltransferase to form the active compound soraphen A. A similar situation exists in the biosynthesis of erythromycin in Saccharopolyspora erythraea. The final step in the biosynthesis of this molecule is the methylation of three hydroxyl groups by a methyltransferase (Haydock et al., Mol. Gen. Genet. 230: 120-128 (1991)). It is highly likely, therefore, that a similar methyltransferase (or possibly more than one) operates in the biosynthesis of soraphen A (soraphen C is unmethylated and soraphen B is partially methylated). In all polyketide biosynthesis systems examined thus far, all of the biosynthetic genes and associated methylases are clustered together (Summers et al. J Bacteriol 174: 1810-1820 (1992)). It is also probable, therefore, that a similar situation exists in the soraphen operon and that the gene encoding the methyltransferase(s) required for the conversion of soraphen B and C to soraphen A is located near the ORF1 and ORF2 that encode the polyketide synthase. The results of the gene disruption experiments described above indicate that this gene is not located immediately downstream from the 3' end of ORF1 and that it is likely located upstream of ORF2 in the DNA contained in pJL1 and pJL3. Thus, using standard techniques in the art, the methyltransferase gene can be cloned and sequenced.

Soraphen Determination

Sorangium cellulosum cells were cultured in a liquid growth medium containing an exchange resin, XAD-5 (Rohm and Haas) (5% w/v). The soraphen A produced by the cells bound to the resin, which was collected by filtration through a polyester filter (Sartorius B 420-47-N), and the soraphen was released from the resin by extraction with 50 ml isopropanol for 1 hr at 30° C. The isopropanol containing soraphen A was collected and concentrated by drying to a volume of approximately 1 ml. Aliquots of this sample were analyzed by HPLC at 210 nm to detect and quantify the soraphen A. This assay procedure is specific for soraphen A (fully methylated); partially and non-methylated soraphen forms have a different R_(T) and are not measured by this procedure. This procedure was used to assay soraphen A production after gene disruption.

F. Cloning and Characterization of Phenazine Biosynthetic Genes from Pseudomonas aureofaciens

The phenazine antibiotics are produced by a variety of Pseudomonas and Streptomyces species as secondary metabolites branching off the shikimic acid pathway. It has been postulated that two chorismic acid molecules are condensed along with two nitrogens derived from glutamine to form the three-ringed phenazine pathway precursor phenazine-1,6-dicarboxylate. However, there is also genetic evidence that anthranilate is an intermediate between chorismate and phenazine-1,6-dicarboxylate (Essar et al., J. Bacteriol. 172: 853-866 (1990)). In Pseudomonas aureofaciens 30-84, production of three phenazine antibiotics, phenazine-1-carboxylic acid, 2-hydroxyphenazine-1-carboxylic acid, and 2-hydroxyphenazine, is the major mode of action by which the strain protects wheat from the fungal phytopathogen Gaeumannomyces graminis var. tritici (Pierson & Thomashow, MPMI 5: 330-339 (1992)). Likewise, in Pseudomonas fluorescens 2-79, phenazine production is a major factor in the control of G. graminis var. tritici (Thomashow & Weller, J. Bacteriol. 170: 3499-3508 (1988)).

Example 18

Isolation of the Phenazine Biosynthetic Genes

Pierson & Thomashow (supra) have previously described the cloning of a cosmid which confers a phenazine biosynthesis phenotype on transposon insertion mutants of Pseudomonas aureofaciens strain 30-84 which were disrupted in their ability to synthesize phenazine antibiotics. A mutant library of strain 30-84 was made by conjugation with E. coli S17-1(pSUP1021) and mutants unable to produce phenazine antibiotics were selected. Selected mutants were unable to produce phenazine carboxylic acid, 2-hydroxyphenazine or 2-hydroxy-phenazine carboxylic acid. These mutants were transformed by a cosmid genomic library of strain 30-84 leading to the isolation of cosmid pLSP259 which had the ability to complement phenazine mutants by the synthesis of phenazine carboxylic acid, 2-hydroxyphenazine and 2-hydroxy-phenazine carboxylic acid. pLSP259 was further characterized by transposon mutagenesis using the λ::Tn5 phage described by de Bruijn & Lupski (Gene 27: 131-149 (1984)). Thus a segment of approximately 2.8 kb of DNA was identified as being responsible for the phenazine complementing phenotype; this 2.8 kb segment is located within a larger 9.2 kb EcoRI fragment of pLSP259. Transfer of the 9.2 kb EcoRI fragment and various deletion derivatives thereof to E. coli under the control of the lacZ promoter was undertaken to assay for the production in E. coli of phenazine. The shortest deletion derivative which was found to confer biosynthesis of all three phenazine compounds to E. coli contained an insert of approximately 6 kb and was designated pLSP18-6H3del3. This plasmid contained the 2.8 kb segment previously identified as being critical to phenazine biosynthesis in the host 30-84 strain and was provided by Dr L S Pierson (Department of Plant Pathology, U Arizona, Tucson, Ariz.) for sequence characterization. Other deletion derivatives were able to confer production of phenazine-carboxylic acid on E. coli, without the accompanying production of 2-hydroxyphenazine and 2-hydroxyphenazine carboxylic acid, suggesting that at least two genes might be involved in the synthesis of phenazine and its hydroxy derivatives.

The DNA sequence comprising the genes for the biosynthesis of phenazine is disclosed in SEQ ID NO:17. Determination of the DNA sequence of the insert of pLSP18-6H3del3 revealed the presence of four ORFs within and adjacent to the critical 2.8 kb segment. ORF1 (SEQ ID NO:18) was designated phz1, ORF2 (SEQ ID NO:19) was designated phz2, ORF3 (SEQ ID NO:20) was designated phz3, and ORF4 (SEQ ID NO:22) was designated phz4. phz1 is approximately 1.35 kb in size and has homology at the 5' end to the entB gene of E. coli, which encodes isochorismatase. phz2 is approximately 1.15 kb in size and has some homology at the 3' end to the trpG gene which encodes the beta subunit of anthranilate synthase. phz3 is approximately 0.85 kb in size. phz4 is approximately 0.65 kb in size and is homologous to the pdxH gene of E. coli which encodes pyridoxamine 5'-phosphate oxidase.

Phenazine Determination

Thomashow et al. (Appl Environ Microbiol 56: 908-912 (1990)) describe a method for the isolation of phenazine. This involves acidifying cultures to pH 2.0 with HCl and extraction with benzene. Benzene fractions are dehydrated with Na₂SO₄ and evaporated to dryness. The residue is redissolved in aqueous 5% NaHCO₃, reextracted with an equal volume of benzene, acidified, partitioned into benzene and redried. Phenazine concentrations are determined after fractionation by reverse-phase HPLC as described by Thomashow et al. (supra).

G. Cloning Peptide Antipathogenic Genes

This group of substances is diverse and is classifiable into two groups: (1) those which are synthesized by enzyme systems without the participation of the ribosomal apparatus, and (2) those which require the ribosomally-mediated translation of an mRNA to provide the precursor of the antibiotic.

Non-Ribosomal Peptide Antibiotics

Non-ribosomal peptide antibiotics are assembled by large, multifunctional enzymes which activate, modify, polymerize and in some cases cyclize the subunit amino acids, forming polypeptide chains. Other acids, such as aminoadipic acid, diaminobutyric acid, diaminopropionic acid, dihydroxyamino acid, isoserine, dihydroxybenzoic acid, hydroxyisovaleric acid, (4R)-4-[(E)-2-butenyl]-4,N-dimethyl-L-threonine, and ornithine are also incorporated (Katz & Demain, Bacteriological Reviews 41: 449-474 (1977); Kleinkauf & von Dohren, Annual Review of Microbiology 41: 259-289 (1987)). The products are not encoded by any mRNA, and ribosomes do not directly participate in their synthesis. Peptide antibiotics synthesized non-ribosomally can in turn be grouped according to their general structures into linear, cyclic, lactone, branched cyclopeptide, and depsipeptide categories (Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990)). These different groups of antibiotics are produced by the action of modifying and cyclizing enzymes; the basic scheme of polymerization is common to them all. Non-ribosomally synthesized peptide antibiotics are produced by both bacteria and fungi, and include edeine, linear gramicidin, tyrocidine and gramicidin S from Bacillus brevis, mycobacillin from Bacillus subtilis, polymyxin from Bacillus polymyxa, etamycin from Streptomyces griseus, echinomycin from Streptomyces echinatus, actinomycin from Streptomyces clavuligerus, enterochelin from Escherichia coli, gamma-(alpha-L-aminoadipyl)-L-cysteinyl-D-valine (ACV) from Aspergillus nidulans, alamethicine from Trichoderma viride, destruxin from Metarhizium anisopliae, enniatin from Fusarium oxysporum, and beauvericin from Beauveria bassiana. Extensive functional and structural similarity exists between the prokaryotic and eukaryotic systems, suggesting a common origin for both. The activities of peptide antibiotics are similarly broad; toxic effects of different peptide antibiotics in animals, plants, bacteria, and fungi are known (Hansen, Annual Review of Microbiology 47: 535-564 (1993); Katz & Demain, Bacteriological Reviews 41: 449-474 (1977); Kleinkauf & von Dohren, Annual Review of Microbiology 41: 259-289 (1987); Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990); Kolter & Moreno, Annual Review of Microbiology 46: 141-163 (1992)).

Amino acids are activated by the hydrolysis of ATP to form an adenylated amino or hydroxy acid, analogous to the charging reactions carried out by aminoacyl-tRNA synthetases, and then covalent thioester intermediates are formed between the amino acids and the enzyme(s), either at specific cysteine residues or to a thiol donated by pantetheine. The amino acid-dependent hydrolysis of ATP is often used as an assay for peptide antibiotic enzyme complexes (Ishihara, et al., Journal of Bacteriology 171: 1705-1711 (1989)). Once bound to the enzyme, activated amino acids may be modified before they are incorporated into the polypeptide. The most common modifications are epimerization of L-amino (hydroxy) acids to the D- form, N-acylations, cyclizations and N-methylations. Polymerization occurs through the participation of a pantetheine cofactor, which allows the activated subunits to be sequentially added to the polypeptide chain. The mechanism by which the peptide is released from the enzyme complex is important in the determination of the structural class to which the product belongs. Hydrolysis or aminolysis by a free amine of the thiolester will yield a linear (unmodified or terminally aminated) peptide such as edeine; aminolysis of the thiolester by amine groups on the peptide itself will give either cyclic (attack by terminal amine), such as gramicidin S, or branched (attack by side chain amine), such as bacitracin, peptides; lactonization with a terminal or side chain hydroxy will give a lactone, such as destruxin, branched lactone, or cyclodepsipeptide, such as beauvericin.

The enzymes which carry out these reactions are large multifunctional proteins, having molecular weights in accord with the variety of functions they perform. For example, gramicidin synthetases 1 and 2 are 120 and 280 kDa, respectively; ACV synthetase is 230 kDa; enniatin synthetase is 250 kDa; bacitracin synthetases 1, 2, 3 are 335, 240, and 380 kDa, respectively (Katz & Demain, Bacteriological Reviews 41: 449-474 (1977); Kleinkauf & von Dohren, Annual Review of Microbiology 41: 259-289 (1987); Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990)). The size and complexity of these proteins mean that relatively few genes must be cloned in order for the capability for the complete nonribosomal synthesis of peptide antibiotics to be transferred. Further, the functional and structural homology between bacterial and eukaryotic synthetic systems indicates that such genes from any source of a peptide antibiotic can be cloned using the available sequence information, current functional information, and conventional microbiological techniques. The production of a fungicidal, insecticidal, or bactericidal peptide antibiotic in a plant is expected to produce an advantage with respect to the resistance to agricultural pests.

Example 19

Cloning of Gramicidin S Biosynthesis Genes

Gramicidin S is a cyclic antibiotic peptide and has been shown toinhibit the germination of fungal spores (Murray, et al., Letters inApplied Microbiology 3: 5-7 (1986)), and may therefore be useful in theprotection of plants against fungal diseases. The gramicidin Sbiosynthesis operon (grs) from Bacillus brevis ATCC 9999 has been clonedand sequenced, including the entire coding sequences for gramicidinsynthetase 1 (GS1, grsA), another gene in the operon of unknown function(grsT), and GS2 (grsB) (Kratzschmar, et al., Journal of Bacteriology171: 5422-5429 (1989); Krause, et al., Journal of Bacteriology 162:1120-1125 (1985)). By methods well known in the art, pairs of PCRprimers are designed from the published DNA sequence which are suitablefor amplifying segments of approximately 500 base pairs from the grsoperon using isolated Bacillus brevis ATCC 9999 DNA as a template. Thefragments to be amplified are (1) at the 3' end of the coding region ofgrsB, spanning the termination codon, (2) at the 5' end of the grsBcoding sequence, including the initiation codon, (3) at the 3' end ofthe coding sequence of grsA, including the termination codon, (4) at the5' end of the coding sequence of grsA, including the initiation codon,(5) at the 3' end of the coding sequence of grsT, including thetermination codon, and (6) at the 5' end of the coding sequence of grsT,including the initiation codon. The amplified fragments areradioactively or nonradioactively labeled by methods known in the artand used to screen a genomic library of Bacillus brevis ATCC 9999 DNAconstructed in a vector such as λEMBL3. The 6 amplified fragments areused in pairs to isolate cloned fragments of genomic DNA which containintact coding sequences for the three biosynthetic genes. Clones whichhybridize to probes 1 and 2 will contain an intact grsB sequence, thosewhich hybridize to probes 3 and 4 will contain an intact grsA gene,those which hybridize to probes 5 and 6 will contain an intact grsTgene. 
The cloned grsA is introduced into E. coli, and extracts prepared by lysing transformed bacteria through methods known in the art are tested for activity by the determination of phenylalanine-dependent ATP-PP_(i) exchange (Krause, et al., Journal of Bacteriology 162: 1120-1125 (1985)) after removal of proteins smaller than 120 kDa by gel filtration chromatography. GrsB is tested similarly by assaying gel-filtered extracts from transformed bacteria for proline-, valine-, ornithine-, and leucine-dependent ATP-PP_(i) exchange.
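
The paired-probe logic described above can be illustrated with a short sketch. It is not part of the disclosed method; the function names and the clone labels in the usage note are hypothetical, and only the probe numbering (1 through 6) is taken from the text. A clone is scored as carrying an intact gene only when it hybridizes to both the 3'-end and the 5'-end probe of that gene.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not part of the described procedure.

# Probe numbering follows the text: odd probes span termination codons,
# even probes span initiation codons.
PROBE_PAIRS = {
    "grsB": (1, 2),
    "grsA": (3, 4),
    "grsT": (5, 6),
}


def intact_genes(hybridizing_probes):
    """Return the genes for which a clone hybridizes to BOTH end probes."""
    hits = set(hybridizing_probes)
    return sorted(
        gene for gene, pair in PROBE_PAIRS.items() if hits.issuperset(pair)
    )


def classify_clones(clone_hits):
    """Map each clone label to the list of genes it is expected to carry intact."""
    return {clone: intact_genes(probes) for clone, probes in clone_hits.items()}
```

For example, `classify_clones({"clone_a": [1, 2], "clone_b": [2, 3, 4]})` assigns grsB to the hypothetical clone_a and grsA to clone_b; clone_b's single hit with probe 2 indicates only a truncated grsB.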

Example 20

Cloning of Penicillin Biosynthesis Genes

A 38 kb fragment of genomic DNA from Penicillium chrysogenum transfers the ability to synthesize penicillin to the fungi Aspergillus niger and Neurospora crassa, which do not normally produce it (Smith, et al., Bio/Technology 8: 39-41 (1990)). The genes which are responsible for biosynthesis, delta-(L-alpha-aminoadipyl)-L-cysteinyl-D-valine synthetase, isopenicillin N synthetase, and isopenicillin N acyltransferase, have been individually cloned from P. chrysogenum and Aspergillus nidulans, and their sequences determined (Ramon, et al., Gene 57: 171-181 (1987); Smith, et al., EMBO Journal 9: 2743-2750 (1990); Tobin, et al., Journal of Bacteriology 172: 5908-5914 (1990)). The cloning of these genes is accomplished by following the PCR-based approach described above to obtain probes of approximately 500 base pairs from genomic DNA from either Penicillium chrysogenum (for example, strain AS-P-78, from Antibioticos, S. A., Leon, Spain), or from Aspergillus nidulans (for example, strain G69). Their integrity and function may be checked by transforming the non-producing fungi listed above and assaying for antibiotic production and individual enzyme activities as described (Smith, et al., Bio/Technology 8: 39-41 (1990)).

Example 21

Cloning of Bacitracin A Biosynthesis Genes

Bacitracin A is a branched cyclopeptide antibiotic which has potential for the enhancement of disease resistance to bacterial plant pathogens. It is produced by Bacillus licheniformis ATCC 10716, and three multifunctional enzymes, bacitracin synthetases (BA) 1, 2, and 3, are required for its synthesis. The molecular weights of BA1, BA2, and BA3 are 335 kDa, 240 kDa, and 380 kDa, respectively. A 32 kb fragment of Bacillus licheniformis DNA which encodes the BA2 protein and part of the BA3 protein shows that at least these two genes are linked (Ishihara, et al., Journal of Bacteriology 171: 1705-1711 (1989)). Evidence from gramicidin S, penicillin, and surfactin biosynthetic operons suggests that the first protein in the pathway, BA1, will be encoded by a gene which is relatively close to BA2 and BA3. BA3 is purified by published methods, and it is used to raise an antibody in rabbits (Ishihara, et al., supra). A genomic library of Bacillus licheniformis DNA is transformed into E. coli and clones which express antigenic determinants related to BA3 are detected by methods known in the art. Because BA1, BA2, and BA3 are antigenically related, the detection method will provide clones encoding each of the three enzymes. The identity of each clone is confirmed by testing extracts of transformed E. coli for the appropriate amino acid-dependent ATP-PP_(i) exchange. Clones encoding BA1 will exhibit leucine-, glutamic acid-, and isoleucine-dependent ATP-PP_(i) exchange, those encoding BA2 will exhibit lysine- and ornithine-dependent exchange, and those encoding BA3 will exhibit isoleucine-, phenylalanine-, histidine-, aspartic acid-, and asparagine-dependent exchange. If one or two genes are obtained by this method, the others are isolated by "walking" techniques known in the art.
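
As an illustration of how the exchange profiles listed above distinguish the three enzymes, the following sketch matches an observed set of amino acid-dependent ATP-PP_(i) exchange activities against the expected profiles. The function name and the representation of assay results as a simple list of amino acid names are assumptions made for the example; the profiles themselves are those given in the text.

```python
# Illustrative sketch only: the data representation is an assumption.

# Expected amino acid-dependent ATP-PPi exchange activities, as listed in the text.
EXPECTED_PROFILES = {
    "BA1": {"leucine", "glutamic acid", "isoleucine"},
    "BA2": {"lysine", "ornithine"},
    "BA3": {
        "isoleucine",
        "phenylalanine",
        "histidine",
        "aspartic acid",
        "asparagine",
    },
}


def identify_synthetases(observed_amino_acids):
    """Return the synthetases whose full expected profile is present in the extract."""
    observed = {name.lower() for name in observed_amino_acids}
    return sorted(
        enzyme
        for enzyme, profile in EXPECTED_PROFILES.items()
        if profile <= observed  # every expected activity was observed
    )
```

An extract showing only lysine- and ornithine-dependent exchange would be assigned to BA2, whereas an extract showing isoleucine-dependent exchange alone matches no complete profile and would call for further assays.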

Example 22

Cloning of Beauvericin and Destruxin Biosynthesis Genes

Beauvericin is an insecticidal hexadepsipeptide produced by the fungus Beauveria bassiana (Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990)) which will provide protection to plants from insect pests. It is an analog of enniatin, a phytotoxic hexadepsipeptide produced by some phytopathogenic species of Fusarium (Burmeister & Plattner, Phytopathology 77: 1483-1487 (1987)). Destruxin is an insecticidal lactone peptide produced by the fungus Metarhizium anisopliae (James, et al., Journal of Insect Physiology 39: 797-804 (1993)). Monoclonal antibodies directed to the region of the enniatin synthetase complex responsible for N-methylation of activated amino acids cross react with the synthetases for beauvericin and destruxin, demonstrating their structural relatedness (Kleinkauf & von Dohren, European Journal of Biochemistry 192: 1-15 (1990)). The enniatin synthetase gene (esyn1) from Fusarium scirpi has been cloned and sequenced (Haese, et al., Molecular Microbiology 7: 905-914 (1993)), and the sequence information is used to carry out a cloning strategy for the beauvericin synthetase and destruxin synthetase genes as described above. Probes for the beauvericin synthetase (BE) gene and the destruxin synthetase (DXS) gene are produced by amplifying specific regions of Beauveria bassiana genomic DNA or Metarhizium anisopliae genomic DNA using oligomers whose sequences are taken from the enniatin synthetase sequence as PCR primers. Two pairs of PCR primers are chosen, with one pair capable of causing the amplification of the segment of the BE gene spanning the initiation codon, and the other pair capable of causing the amplification of the segment of the BE gene which spans the termination codon. Each pair will cause the production of a DNA fragment which is approximately 500 base pairs in size. Libraries of genomic DNA from Beauveria bassiana and Metarhizium anisopliae are probed with the labeled fragments, and clones which hybridize to both of them are chosen.
Complete coding sequences of beauvericin synthetase will cause the appearance of phenylalanine-dependent ATP-PP_(i) exchange in an appropriate host, and those of destruxin synthetase will cause the appearance of valine-, isoleucine-, and alanine-dependent ATP-PP_(i) exchange. Extracts from these transformed organisms will also carry out the cell-free biosynthesis of beauvericin and destruxin, respectively.

Example 23

Cloning Genes for the Biosynthesis of an Unknown Peptide Antibiotic

The genes for any peptide antibiotic are cloned by the use of conserved regions within the coding sequence. The functions common to all peptide antibiotic synthetases, that is, amino acid activation, ATP binding, and pantetheine binding, are reflected in a repeated domain structure in which each domain spans approximately 600 amino acids. Within the domains, highly conserved sequences are known, and it is expected that related sequences will exist in any peptide antibiotic synthetase, regardless of its source. The published DNA sequences of peptide synthetase genes, including gramicidin synthetases 1 and 2 (Hori, et al., Journal of Biochemistry 106: 639-645 (1989); Krause, et al., Journal of Bacteriology 162: 1120-1125 (1985); Turgay, et al., Molecular Microbiology 6: 529-546 (1992)), tyrocidine synthetase 1 and 2 (Weckermann, et al., Nucleic Acids Research 16: 11841 (1988)), ACV synthetase (MacCabe, et al., Journal of Biological Chemistry 266: 12646-12654 (1991)), enniatin synthetase (Haese, et al., Molecular Microbiology 7: 905-914 (1993)), and surfactin synthetase (Fuma, et al., Nucleic Acids Research 21: 93-97 (1993); Grandi, et al., Eleventh International Spores Conference (1992)) are compared and the individual repeated domains are identified. The domains from all the synthetases are compared as a group, and the most highly conserved sequences are identified. From these conserved sequences, DNA oligomers are designed which are suitable for hybridizing to all of the observed variants of the sequence, and another DNA sequence which lies, for example, from 0.1 to 2 kilobases away from the first DNA sequence, is used to design another DNA oligomer. Such pairs of DNA oligomers are used to amplify by PCR the intervening segment of the unknown gene by combining them with genomic DNA prepared from the organism which produces the antibiotic, and following a PCR amplification procedure.
The fragment of DNA which is produced is sequenced to confirm its identity, and used as a probe to identify clones containing larger segments of the peptide synthetase gene in a genomic library. A variation of this approach, in which the oligomers designed to hybridize to the conserved sequences in the genes were used as hybridization probes themselves, rather than as primers of PCR reactions, resulted in the identification of part of the surfactin synthetase gene from Bacillus subtilis ATCC 21332 (Borchert, et al., FEMS Microbiological Letters 92: 175-180 (1992)). The cloned genomic DNA which hybridizes to the PCR-generated probe is sequenced, and the complete coding sequence is obtained by "walking" procedures. Such "walking" procedures will also yield other genes required for the peptide antibiotic synthesis, because they are known to be clustered.
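
One way to picture the design of an oligomer "suitable for hybridizing to all of the observed variants" of a conserved sequence is as a degenerate consensus written in the standard IUPAC nucleotide ambiguity codes. The sketch below is illustrative only: it assumes the variant sequences are already aligned, of equal length, and free of gaps, and the toy sequences in the usage note are invented for the example rather than taken from any synthetase gene.

```python
# Illustrative sketch only: assumes pre-aligned, ungapped, equal-length DNA sequences.

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_oligo(aligned_variants):
    """Collapse aligned variants into one degenerate oligomer covering all of them."""
    if len({len(seq) for seq in aligned_variants}) != 1:
        raise ValueError("variants must be aligned to the same length")
    columns = zip(*(seq.upper() for seq in aligned_variants))
    return "".join(IUPAC[frozenset(column)] for column in columns)


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligomer."""
    bases_per_code = {code: len(bases) for bases, code in IUPAC.items()}
    total = 1
    for code in oligo.upper():
        total *= bases_per_code[code]
    return total
```

For the toy variants `["ATGC", "ATGT", "ACGC"]`, `degenerate_oligo` gives `"AYGY"`, which represents four sequences. In practice a conserved region with low degeneracy would be preferred, since highly degenerate primers amplify less specifically.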

Another method of obtaining the genes which code for the synthetase(s) of a novel peptide antibiotic is by the detection of antigenic determinants expressed in a heterologous host after transformation with an appropriate genomic library made from DNA from the antibiotic-producing organism. It is expected that the common structural features of the synthetases will be evidenced by cross-reactions with antibodies raised against different synthetase proteins. Such antibodies are raised against peptide synthetases purified from known antibiotic-producing organisms by known methods (Ishihara, et al., Journal of Bacteriology 171: 1705-1711 (1989)). Transformed organisms bearing fragments of genomic DNA from the producer of the unknown peptide antibiotic are tested for the presence of antigenic determinants which are recognized by the anti-peptide synthetase antisera by methods known in the art. The cloned genomic DNA carried by cells which are identified by the antisera is recovered and sequenced. "Walking" techniques, as described earlier, are used to obtain both the entire coding sequence and other biosynthetic genes.

Another method of obtaining the genes which code for the synthetase of an unknown peptide antibiotic is by the purification of a protein which has the characteristics of the appropriate peptide synthetase, and determining all or part of its amino acid sequence. The amino acids present in the antibiotic are determined by first purifying it from a chloroform extract of a culture of the antibiotic-producing organism, for example by reverse phase chromatography on a C₁₈ column in an ethanol-water mixture. The composition of the purified compound is determined by mass spectrometry, NMR, and analysis of the products of acid hydrolysis. The amino or hydroxy acids present in the peptide antibiotic will produce ATP-PP_(i) exchange when added to a peptide-synthetase-containing extract from the antibiotic-producing organism. This reaction is used as an assay to detect the presence of the peptide synthetase during the course of a protein purification scheme, such as are known in the art. A substantially pure preparation of the peptide synthetase is used to determine its amino acid sequence, either by the direct sequencing of the intact protein to obtain the N-terminal amino acid sequence, or by the production, purification, and sequencing of peptides derived from the intact peptide synthetase by the action of specific proteolytic enzymes, as are known in the art. A DNA sequence is inferred from the amino acid sequence of the synthetase, and DNA oligomers are designed which are capable of hybridizing to such a coding sequence. The oligomers are used to probe a genomic library made from the DNA of the antibiotic-producing organism. Selected clones are sequenced to identify them, and complete coding sequences and associated genes required for peptide biosynthesis are obtained by using "walking" techniques.
Extracts from organisms which have been transformed with the entire complement of peptide biosynthetic genes, for example bacteria or fungi, will produce the peptide antibiotic when provided with the required amino or hydroxy acids, ATP, and pantetheine.
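
Because most amino acids are specified by more than one codon, a DNA sequence inferred from an amino acid sequence is a family of possible coding sequences. The following sketch, offered only as an illustration, counts the members of that family for a peptide and picks the stretch of a determined sequence that gives the least degenerate oligomer probe. It assumes the standard genetic code and ignores codon usage bias of the producing organism; the function names and the example peptide are hypothetical.

```python
# Illustrative sketch only: assumes the standard genetic code; names are hypothetical.

# Number of codons specifying each amino acid (one-letter codes).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_degeneracy(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]
    return total


def least_degenerate_window(peptide, window_length):
    """Return (start index, sub-peptide, degeneracy) of the best probe site."""
    if window_length > len(peptide):
        raise ValueError("window is longer than the peptide")
    candidates = [
        (start, peptide[start:start + window_length])
        for start in range(len(peptide) - window_length + 1)
    ]
    start, window = min(candidates, key=lambda item: probe_degeneracy(item[1]))
    return start, window, probe_degeneracy(window)
```

With the invented peptide `"LSRMWKF"` and a window of three residues, the sketch selects `"MWK"`, which can be encoded by only two nucleotide sequences, over `"LSR"`, which can be encoded by 216.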

Further methods appropriate for the cloning of genes required for the synthesis of non-ribosomal peptide antibiotics are described in Section B of the examples.

Ribosomally-Synthesized Peptide Antibiotics

Ribosomally-synthesized peptide antibiotics are characterized by the existence of a structural gene for the antibiotic itself, which encodes a precursor that is modified by specific enzymes to create the mature molecule. The use of the general protein synthesis apparatus for peptide antibiotic synthesis opens up the possibility for much longer polymers to be made, although these peptide antibiotics are not necessarily very large. In addition to a structural gene, further genes are required for extracellular secretion and immunity, and these genes are believed to be located close to the structural gene, in most cases probably on the same operon. Two major groups of peptide antibiotics made on ribosomes exist: those which contain the unusual amino acid lanthionine, and those which do not. Lanthionine-containing antibiotics (lantibiotics) are produced by gram-positive bacteria, including species of Lactococcus, Staphylococcus, Streptococcus, Bacillus, and Streptomyces. Linear lantibiotics (for example, nisin, subtilin, epidermin, and gallidermin), and circular lantibiotics (for example, duramycin and cinnamycin), are known (Hansen, Annual Review of Microbiology 47: 535-564 (1993); Kolter & Moreno, Annual Review of Microbiology 46: 141-163 (1992)). Lantibiotics often contain other characteristic modified residues such as dehydroalanine (DHA) and dehydrobutyrine (DHB), which are derived from the dehydration of serine and threonine, respectively. The reaction of a thiol from cysteine with DHA yields lanthionine, and with DHB yields β-methyllanthionine. Peptide antibiotics which do not contain lanthionine may contain other modifications, or they may consist only of the ordinary amino acids used in protein synthesis. Non-lanthionine-containing peptide antibiotics are produced by both gram-positive and gram-negative bacteria, including Lactobacillus, Lactococcus, Pediococcus, Enterococcus, and Escherichia.
Antibiotics in this category include lactacins, lactocins, sakacin A, pediocins, diplococcin, lactococcins, and microcins (Hansen, supra; Kolter & Moreno, supra). In general, peptide antibiotics whose synthesis is begun on ribosomes are subject to several types of post-translational processing, including proteolytic cleavage and modification of amino acid side chains, and require the presence of a specific transport and/or immunity mechanism. The necessity for protection from the effects of these antibiotics appears to contrast strongly with the lack of such systems for nonribosomal peptide antibiotics. This may be rationalized by considering that the antibiotic activity of many ribosomally-synthesized peptide antibiotics is directed at a narrow range of bacteria which are fairly closely related to the producing organism. In this situation, a particular method of distinguishing the producer from the competitor is required, or else the advantage is lost. As antibiotics, this property has limited the usefulness of this class of molecules for situations in which a broad range of activity is desirable, but enhances their attractiveness in cases when a very limited range of activities is advantageous. In eukaryotic systems, which are not known to be sensitive to any of this type of peptide antibiotic, it is not clear if production of a ribosomally-synthesized peptide antibiotic necessitates one of these transport systems, or if transport out of the cell is merely a matter of placing the antibiotic in a better location to encounter potential pathogens. This question can be addressed experimentally, as shown in the examples which follow.

Example 24

Cloning Genes for the Biosynthesis of a Lantibiotic

Examination of genes linked to the structural genes for the lantibiotics nisin, subtilin, and epidermin shows several open reading frames which share sequence homology, and the predicted amino acid sequences suggest functions which are necessary for the maturation and transport of the antibiotic. The spa genes of Bacillus subtilis ATCC 6633, including spaS, the structural gene encoding the precursor to subtilin, have been sequenced (Chung & Hansen, Journal of Bacteriology 174: 6699-6702 (1992); Chung, et al., Journal of Bacteriology 174: 1417-1422 (1992); Klein, et al., Applied and Environmental Microbiology 58: 132-142 (1992)). Open reading frames were found only upstream of spaS, at least within a distance of 1-2 kilobases. Several of the open reading frames appear to be part of the same transcriptional unit, spaE, spaD, spaB, and spaC, with a putative promoter upstream of spaE. Both spaB, which encodes a protein of 599 amino acids, and spaD, which encodes a protein of 177 amino acids, share homology to genes required for the transport of hemolysin, coding for the HlyB and HlyD proteins, respectively. SpaE, which encodes a protein of 851 amino acids, is homologous to nisB, a gene linked to the structural gene for nisin, for which no function is known. SpaC codes for a protein of 442 amino acids of unknown function, but disruption of it eliminates production of subtilin. These genes are contained on a segment of genomic DNA which is approximately 7 kilobases in size (Chung & Hansen, Journal of Bacteriology 174: 6699-6702 (1992); Chung, et al., Journal of Bacteriology 174: 1417-1422 (1992); Klein, et al., Applied and Environmental Microbiology 58: 132-142 (1992)). It has not been clearly demonstrated if these genes are completely sufficient to confer the ability to produce subtilin.
A 13.5 kilobase pair (kb) fragment from plasmid Tu32 of Staphylococcus epidermidis Tu3298 containing the structural gene for epidermin (epiA) also contains five open reading frames denoted epiB, epiC, epiD, epiQ, and epiP. The genes epiBC are homologous to the genes spaBC, while epiQ appears to be involved in the regulation of the expression of the operon, and epiP may encode a protease which acts during the maturation of pre-epidermin to epidermin. EpiD encodes a protein of 181 amino acids which binds the coenzyme flavin mononucleotide, and is suggested to perform post-translational modification of pre-epidermin (Kupke, et al., Journal of Bacteriology 174: (1992); Peschel, et al., Molecular Microbiology 9: 31-39 (1993); Schnell, et al., European Journal of Biochemistry 204: 57-68 (1992)). It is expected that many, if not all, of the genes required for the biosynthesis of a lantibiotic will be clustered, and physically close together on either genomic DNA or on a plasmid, and an approach which allows one of the necessary genes to be located will be useful in finding and cloning the others. The structural gene for a lantibiotic is cloned by designing oligonucleotide probes based on the amino acid sequence determined from a substantially purified preparation of the lantibiotic itself, as has been done with the lantibiotics lacticin 481 from Lactococcus lactis subsp. lactis CNRZ 481 (Piard, et al., Journal of Biological Chemistry 268: 16361-16368 (1993)), streptococcin A-FF22 from Streptococcus pyogenes FF22 (Hynes, et al., Applied and Environmental Microbiology 59: 1969-1971 (1993)), and salivaricin A from Streptococcus salivarius 203P (Ross, et al., Applied and Environmental Microbiology 59: 2014-2021 (1993)).
Fragments of bacterial DNA approximately 10-20 kilobases in size containing the structural gene are cloned and sequenced to determine regions of homology to the characterized genes in the spa, epi, and nis operons. Open reading frames which have homology to any of these genes or which lie in the same transcriptional unit as open reading frames having homology to any of these genes are cloned individually using techniques known in the art. A fragment of DNA containing all of the associated reading frames and no others is transformed into a non-producing strain of bacteria, such as Escherichia coli, and the production of the lantibiotic analyzed, in order to demonstrate that all the required genes are present.

Example 25

Cloning Genes for the Biosynthesis of a Non-Lanthionine-Containing, Ribosomally Synthesized Peptide Antibiotic

The lack of the extensive modifications present in lantibiotics is expected to reduce the number of genes required to account for the complete synthesis of peptide antibiotics exemplified by lactacin F, sakacin A, lactococcin A, and helveticin J. Clustered genes involved in the biosynthesis of antibiotics were found in Lactobacillus johnsonii VPI11088, for lactacin F (Fremaux, et al., Applied and Environmental Microbiology 59: 3906-3915 (1993)), in Lactobacillus sake Lb706 for sakacin A (Axelsson, et al., Applied and Environmental Microbiology 59: 2868-2875 (1993)), in Lactococcus lactis for lactococcin A (Stoddard, et al., Applied and Environmental Microbiology 58: 1952-1961 (1992)), and in Pediococcus acidilactici for pediocin PA-1 (Marugg, et al., Applied and Environmental Microbiology 58: 2360-2367 (1992)). The genes required for the biosynthesis of a novel non-lanthionine-containing peptide antibiotic are cloned by first determining the amino acid sequence of a substantially purified preparation of the antibiotic, designing DNA oligomers based on the amino acid sequence, and probing a DNA library constructed from either genomic or plasmid DNA from the producing bacterium. Fragments of DNA of 5-10 kilobases which contain the structural gene for the antibiotic are cloned and sequenced. Open reading frames which have homology to sakB from Lactobacillus sake, or to lafX, ORFY, or ORFZ from Lactobacillus johnsonii, or which are part of the same transcriptional unit as the antibiotic structural gene or genes having homology to those genes previously mentioned are individually cloned by methods known in the art. A fragment of DNA containing all of the associated reading frames and no others is transformed into a non-producing strain of bacteria, such as Escherichia coli, and the production of the antibiotic analyzed, in order to demonstrate that all the required genes are present.

H. Expression of Antibiotic Biosynthetic Genes in Microbial Hosts

Example 26

Overexpression of APS Biosynthetic Genes for Overproduction of APS Using Fermentation-Type Technology

The APS biosynthetic genes of this invention can be expressed in heterologous organisms for the purposes of their production at greater quantities than might be possible from their native hosts.

A suitable host for heterologous expression is E. coli and techniques for gene expression in E. coli are well known. For example, the cloned APS genes can be expressed in E. coli using the expression vector pKK223 as described in Example 11. The cloned genes can be fused in transcriptional fusion, so as to use the available ribosome binding site cognate to the heterologous gene. This approach facilitates the expression of operons which encode more than one open reading frame, as translation of the individual ORFs will thus be dependent on their cognate ribosome binding site signals. Alternatively, APS genes can be fused to the vector's ATG (e.g. as an NcoI fusion) so as to use the E. coli ribosome binding site. For multiple ORF expression in E. coli (e.g. in the case of operons with multiple ORFs) this type of construct would require a separate promoter to be fused to each ORF. It is possible, however, to fuse the first ATG of the APS operon to the E. coli ribosome binding site while requiring the other ORFs to utilize their cognate ribosome binding sites. These types of construction for the overexpression of genes in E. coli are well known in the art. Suitable bacterial promoters include the lac promoter, the tac (trp/lac) promoter, and the Pλ promoter from bacteriophage λ. Suitable commercially available vectors include, for example, pKK223-3, pKK233-2, pDR540, pDR720, pYEJ001 and pPL-Lambda (from Pharmacia, Piscataway, N.J.).

Similarly, gram-positive bacteria, notably Bacillus species and particularly Bacillus licheniformis, are used in commercial scale production of heterologous proteins and can be adapted to the expression of APS biosynthetic genes (e.g. Quax et al., In: Industrial Microorganisms: Basic and Applied Molecular Genetics, Eds.: Baltz et al., American Society for Microbiology, Washington (1993)). Regulatory signals from highly expressed Bacillus genes (e.g. amylase promoter, Quax et al., supra) are used to generate transcriptional fusions with the APS biosynthetic genes.

In some instances, high level expression of bacterial genes has been achieved using yeast systems, such as the methylotrophic yeast Pichia pastoris (Sreekrishna, In: Industrial microorganisms: basic and applied molecular genetics, Baltz, Hegeman, and Skatrud eds., American Society for Microbiology, Washington (1993)). The APS gene(s) of interest are positioned behind 5' regulatory sequences of the Pichia alcohol oxidase gene in vectors such as pHIL-D1 and pHIL-D2 (Sreekrishna, supra). Such vectors are used to transform Pichia and introduce the heterologous DNA into the yeast genome. Likewise, the yeast Saccharomyces cerevisiae has been used to express heterologous bacterial genes (e.g. Dequin & Barre, Biotechnology 12: 173-177 (1994)). The yeast Kluyveromyces lactis is also a suitable host for heterologous gene expression (e.g. van den Berg et al., Biotechnology 8: 135-139 (1990)).

Overexpression of APS genes in organisms such as E. coli, Bacillus and yeast, which are known for their rapid growth and multiplication, will enable fermentation-production of larger quantities of APSs. The choice of organism may be restricted by the possible susceptibility of the organism to the APS being overproduced; however, the likely susceptibility can be determined by the procedures outlined in Section J. The APSs can be isolated and purified from such cultures (see "G") for use in the control of microorganisms such as fungi and bacteria.

I. Expression of Antibiotic Biosynthetic Genes in Microbial Hosts for Biocontrol Purposes

The cloned APS biosynthetic genes of this invention can be utilized to increase the efficacy of biocontrol strains of various microorganisms. One possibility is the transfer of the genes for a particular APS back into its native host under stronger transcriptional regulation to cause the production of larger quantities of the APS. Another possibility is the transfer of genes to a heterologous host, causing production in the heterologous host of an APS not normally produced by that host.

Microorganisms which are suitable for the heterologous overexpression of APS genes are all microorganisms which are capable of colonizing plants or the rhizosphere. As such they will be brought into contact with phytopathogenic fungi, causing an inhibition of their growth. These include the gram-negative microorganisms Pseudomonas, Enterobacter and Serratia, the gram-positive microorganism Bacillus and the fungi Trichoderma and Gliocladium. Particularly preferred heterologous hosts are Pseudomonas fluorescens, Pseudomonas putida, Pseudomonas cepacia, Pseudomonas aureofaciens, Pseudomonas aurantiaca, Enterobacter cloacae, Serratia marcescens, Bacillus subtilis, Bacillus cereus, Trichoderma viride, Trichoderma harzianum and Gliocladium virens.

Example 27

Expression of APS Biosynthetic Genes in E. coli and Other Gram-Negative Bacteria

Many genes have been expressed in gram-negative bacteria in a heterologous manner. Example 11 describes the expression of genes for pyrrolnitrin biosynthesis in E. coli using the expression vector pKK223-3 (Pharmacia catalogue #27-4935-01). This vector has a strong tac promoter (Brosius, J. et al., Proc. Natl. Acad. Sci. USA 81) regulated by the lac repressor and induced by IPTG. A number of other expression systems have been developed for use in E. coli and some are detailed in E (above). The thermoinducible expression vector pP_(L) (Pharmacia #27-4946-01) uses a tightly regulated bacteriophage λ promoter which allows for high level expression of proteins. The lac promoter provides another means of expression but the promoter is not expressed at such high levels as the tac promoter. With the addition of broad host range replicons to some of these expression system vectors, production of antifungal compounds in closely related gram-negative bacteria such as Pseudomonas, Enterobacter, Serratia and Erwinia is possible. For example, pLRKD211 (Kaiser & Kroos, Proc. Natl. Acad. Sci. USA 81: 5816-5820 (1984)) contains the broad host range replicon ori T which allows replication in many gram-negative bacteria.

In E. coli, induction by IPTG is required for expression of the tac (i.e. trp-lac) promoter. When this same promoter (e.g. on wide-host range plasmid pLRKD211) is introduced into Pseudomonas it is constitutively active without induction by IPTG. This trp-lac promoter can be placed in front of any gene or operon of interest for expression in Pseudomonas or any other closely related bacterium for the purposes of the constitutive expression of such a gene. If the operon of interest contains the information for the biosynthesis of an APS, then an otherwise biocontrol-minus strain of a gram-negative bacterium may be able to protect plants against a variety of fungal diseases. Thus, genes for antifungal compounds can be placed behind a strong constitutive promoter and transferred to a bacterium that normally does not produce antifungal products and which has plant or rhizosphere colonizing properties, turning these organisms into effective biocontrol strains. Other possible promoters can be used for the constitutive expression of APS genes in gram-negative bacteria. These include, for example, the promoter from the Pseudomonas regulatory genes gafA and lemA (WO 94/01561) and the Pseudomonas savastanoi IAA operon promoter (Gaffney et al., J. Bacteriol. 172: 5593-5601 (1990)).

Example 28

Expression of APS Biosynthetic Genes in Gram-Positive Bacteria

Heterologous expression of APS biosynthetic genes in gram-positive bacteria is another means of producing new biocontrol strains. Expression systems for Bacillus and Streptomyces are the best characterized. The promoter for the erythromycin resistance gene (ermR) from Streptococcus pneumoniae has been shown to be active in gram-positive aerobes and anaerobes and also in E. coli (Trieu-Cuot et al., Nucl Acids Res 18: 3660 (1990)). A further antibiotic resistance promoter from the thiostrepton gene has been used in Streptomyces cloning vectors (Bibb, Mol Gen Genet 199: 26-36 (1985)). The shuttle vector pHT3101 is also appropriate for expression in Bacillus (Lereclus, FEMS Microbiol Lett 60: 211-218 (1989)). By expressing an operon (such as the pyrrolnitrin operon) or individual APS encoding genes under control of the ermR or other promoters it will be possible to convert soil bacilli into strains able to protect plants against microbial diseases. A significant advantage of this approach is that many gram-positive bacteria produce spores which can be used in formulations that produce biocontrol products with a longer shelf life. Bacillus and Streptomyces species are aggressive colonizers of soils. In fact both produce secondary metabolites including antibiotics active against a broad range of organisms, and the addition of heterologous antifungal genes (including those encoding pyrrolnitrin, soraphen, phenazine or cyclic peptides) to gram-positive bacteria may make these organisms even better biocontrol strains.

Example 29

Expression of APS Biosynthetic Genes in Fungi

Trichoderma harzianum and Gliocladium virens have been shown to provide varying levels of biocontrol in the field (U.S. Pat. No. 5,165,928 and U.S. Pat. No. 4,996,157, both to Cornell Research Foundation). The successful use of these biocontrol agents will be greatly enhanced by the development of improved strains by the introduction of genes for APSs. This could be accomplished in a number of ways which are well known in the art. One is protoplast mediated transformation of the fungus by PEG or electroporation-mediated techniques. Alternatively, particle bombardment can be used to transform protoplasts or other fungal cells with the ability to develop into regenerated mature structures. The vector pAN7-1, originally developed for Aspergillus transformation and now used widely for fungal transformation (Curragh et al., Mycol. Res. 97(3): 313-317 (1992); Tooley et al., Curr. Genet. 21: 55-60 (1992); Punt et al., Gene 56: 117-124 (1987)) is engineered to contain the pyrrolnitrin operon, or any other genes for APS biosynthesis. This plasmid contains the E. coli hygromycin B resistance gene flanked by the Aspergillus nidulans gpd promoter and the trpC terminator (Punt et al., Gene 56: 117-124 (1987)).

J. In Vitro Activity of Anti-phytopathogenic Substances Against Plant Pathogens

Example 30

Bioassay Procedures for the Detection of Antifungal Activity

Inhibition of fungal growth by a potential antifungal agent can be determined in a number of assay formats. Macroscopic methods which are commonly used include the agar diffusion assay (Dhingra & Sinclair, Basic Plant Pathology Methods, CRC Press, Boca Raton, Fla. (1985)) and assays in liquid media (Broekaert et al., FEMS Microbiol. Lett. 69: 55-60 (1990)). Both types of assay are performed with either fungal spores or mycelia as inocula. The maintenance of fungal stocks is in accordance with standard mycological procedures. Spores for bioassay are harvested from a mature plate of a fungus by flushing the surface of the culture with sterile water or buffer. A suspension of mycelia is prepared by placing fungus from a plate in a blender and homogenizing until the colony is dispersed. The homogenate is filtered through several layers of cheesecloth so that larger particles are excluded. The suspension which passes through the cheesecloth is washed by centrifugation and replacement of the supernatant with fresh buffer. The concentration of the mycelial suspension is adjusted empirically, by testing the suspension in the bioassay to be used.

Agar diffusion assays may be performed by suspending spores or mycelial fragments in a solid test medium, and applying the antifungal agent at a point source, from which it diffuses. This may be done by adding spores or mycelia to melted fungal growth medium, then pouring the mixture into a sterile dish and allowing it to gel. Sterile filters are placed on the surface of the medium, and solutions of antifungal agents are spotted onto the filters. After the liquid has been absorbed by the filter, the plates are incubated at the appropriate temperature, usually for 1-2 days. Growth inhibition is indicated by the presence of zones around filters in which spores have not germinated, or in which mycelia have not grown. The antifungal potency of the agent, denoted as the minimal effective dose, may be quantified by spotting serial dilutions of the agent onto filters, and determining the lowest dose which gives an observable inhibition zone. Another agar diffusion assay can be performed by cutting wells into solidified fungal growth medium and placing solutions of antifungal agents into them. The plate is inoculated at a point equidistant from all the wells, usually at the center of the plate, with either a small aliquot of spore or mycelial suspension or a mycelial plug cut directly from a stock culture plate of the fungus. The plate is incubated for several days until the growing mycelia approach the wells, then it is observed for signs of growth inhibition. Inhibition is indicated by the deformation of the roughly circular form which the fungal colony normally assumes as it grows. Specifically, if the mycelial front appears flattened or even concave relative to the uninhibited sections of the plate, growth inhibition has occurred. A minimal effective concentration may be determined by testing diluted solutions of the agent to find the lowest at which an effect can be detected.

Bioassays in liquid media are conducted using suspensions of spores or mycelia which are incubated in liquid fungal growth media instead of solid media. The fungal inocula, medium, and antifungal agent are mixed in wells of a 96-well microtiter plate, and the growth of the fungus is followed by measuring the turbidity of the culture spectrophotometrically. Increases in turbidity correlate with increases in biomass, and are a measure of fungal growth. Growth inhibition is determined by comparing the growth of the fungus in the presence of the antifungal agent with growth in its absence. By testing diluted solutions of antifungal inhibitor, a minimal inhibitory concentration or an EC₅₀ may be determined.
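By way of illustration only, the determination of an EC₅₀ from a dilution series can be sketched as follows. The function names, the choice of linear interpolation on a log-concentration scale, and the example numbers are assumptions made for this sketch; the procedure above does not prescribe any particular curve-fitting method.

```python
import math


def percent_inhibition(treated_growth, control_growth):
    """Growth inhibition (%) of a treated well relative to the untreated control."""
    if control_growth <= 0:
        raise ValueError("control growth must be positive")
    return 100.0 * (1.0 - treated_growth / control_growth)


def estimate_ec50(concentrations, treated_growth, control_growth):
    """Estimate the EC50 by linear interpolation on log10(concentration).

    concentrations  -- tested doses (all > 0, any consistent unit)
    treated_growth  -- turbidity increase measured at each dose
    control_growth  -- turbidity increase without the agent
    Returns None when no pair of adjacent doses brackets 50% inhibition.
    """
    points = sorted(zip(concentrations, treated_growth))
    inhibition = [(c, percent_inhibition(g, control_growth)) for c, g in points]
    for (c_lo, i_lo), (c_hi, i_hi) in zip(inhibition, inhibition[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ec50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    return None


# Hypothetical series: 25% inhibition at dose 1 and 75% at dose 100.
ec50 = estimate_ec50([1.0, 100.0], [0.75, 0.25], control_growth=1.0)
```

Interpolating on the log scale suits serial dilutions, in which adjacent doses differ by a constant factor; a fitted dose-response curve could be substituted where more data points are available.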

Example 31

Bioassay Procedures for the Detection of Antibacterial Activity

A number of bioassays may be employed to determine the antibacterial activity of an unknown compound. The inhibition of bacterial growth in solid media may be assessed by dispersing an inoculum of the bacterial culture in melted medium and spreading the suspension evenly in the bottom of a sterile Petri dish. After the medium has gelled, sterile filter disks are placed on the surface, and aliquots of the test material are spotted onto them. The plate is incubated overnight at an appropriate temperature, and growth inhibition is observed as an area around a filter in which the bacteria have not grown, or in which the growth is reduced compared to the surrounding areas. Pure compounds may be characterized by the determination of a minimal effective dose, the smallest amount of material which gives a zone of inhibited growth. In liquid media, two other methods may be employed. The growth of a culture may be monitored by measuring the optical density of the culture, in actuality the scattering of incident light. Equal inocula are seeded into equal culture volumes, with one culture containing a known amount of a potential antibacterial agent. After incubation at an appropriate temperature, and with appropriate aeration as required by the bacterium being tested, the optical densities of the cultures are compared. A suitable wavelength for the comparison is 600 nm. The antibacterial agent may be characterized by the determination of a minimal effective dose, the smallest amount of material which produces a reduction in the density of the culture, or by determining an EC₅₀, the concentration at which the growth of the test culture is half that of the control. The bioassays described above do not differentiate between bacteriostatic and bacteriocidal effects. Another assay can be performed which will determine the bacteriocidal activity of the agent.

This assay is carried out by incubating the bacteria and the active agent together in liquid medium for an amount of time and under conditions which are sufficient for the agent to exert its effect. After this incubation is completed, the bacteria may be either washed by centrifugation and resuspension, or diluted by the addition of fresh medium. In either case, the concentration of the antibacterial agent is reduced to a point at which it is no longer expected to have significant activity. The bacteria are plated and spread on solid medium and the plates are incubated overnight at an appropriate temperature for growth. The colonies which arise on the plates are counted, and the number which appeared from the mixture which contained the antibacterial agent is compared with the number which arose from the mixture which contained no antibacterial agent. The reduction in colony-forming units is a measure of the bacteriocidal activity of the agent. The bacteriocidal activity may be quantified as a minimal effective dose, or as an EC₅₀, as described above. Bacteria which are used in assays such as these include species of Agrobacterium, Erwinia, Clavibacter, Xanthomonas, and Pseudomonas.
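The comparison of colony counts described above can be expressed numerically as in the following sketch. The helper names, the dilution factor, the plated volume and the colony counts are all hypothetical and serve only to show the arithmetic; reporting the result as a percentage or as a log10 reduction is a matter of convention, not something specified above.

```python
import math


def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Colony-forming units per ml of the undiluted mixture."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def percent_reduction(treated_cfu, control_cfu):
    """Reduction in CFU (%) caused by the agent, relative to the untreated mixture."""
    return 100.0 * (1.0 - treated_cfu / control_cfu)


def log10_reduction(treated_cfu, control_cfu):
    """Reduction in CFU expressed in log10 units (requires treated_cfu > 0)."""
    return math.log10(control_cfu / treated_cfu)


# Hypothetical counts: 0.1 ml plated from a 1000-fold dilution of each mixture.
control = cfu_per_ml(colonies=50, dilution_factor=1000, plated_volume_ml=0.1)
treated = cfu_per_ml(colonies=5, dilution_factor=100, plated_volume_ml=0.1)
reduction = percent_reduction(treated, control)
```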

Example 32

Antipathogenic Activity Determination of APSs

APSs are assayed using the procedures of examples 30 and 31 above to identify the range of fungi and bacteria against which they are active. The APS can be isolated from the cells and culture medium of the host organism normally producing it, or can alternatively be isolated from a heterologous host which has been engineered to produce the APS. A further possibility is the chemical synthesis of APS compounds of known chemical structure, or derivatives thereof.

Example 33

Antimicrobial Activity Determination of Pyrrolnitrin

The anti-phytopathogenic activity of a fluorinated 3-cyano-derivative of pyrrolnitrin (designated CGA173506) was observed against the maize fungal phytopathogens Diplodia maydis, Colletotrichum graminicola, and Gibberella zeae-maydis. Spores of the fungi were harvested and suspended in water. Approximately 1000 spores were inoculated into potato dextrose broth containing either CGA173506 or water, in a total volume of 100 microliters, in the wells of 96-well microtiter plates suitable for a plate reader. The compound CGA173506 was obtained as a 50% wettable powder, and a stock suspension was made up at a concentration of 10 mg/ml in sterile water. This stock suspension was diluted with sterile water to provide the CGA173506 used in the tests. After the spores, medium, and CGA173506 were mixed, the turbidity in the wells was measured by reading the absorbance at 600 nm in a plate reader. This reading was taken as the background turbidity, and was subtracted from readings taken at later times. After 46 hours of incubation, the presence of 1 microgram/ml of CGA173506 was determined to reduce the growth of Diplodia maydis by 64%, and after 120 hours, the same concentration of CGA173506 inhibited the growth of Colletotrichum graminicola by 50%. After 40 hours of incubation, the presence of 0.5 microgram/ml of CGA173506 gave 100% inhibition of Gibberella zeae-maydis.
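The background correction and the comparison against the water control described in this example can be illustrated with the short sketch below. The function names and the absorbance readings are hypothetical; they are not the measurements underlying the percentages reported above.

```python
def net_absorbance(reading, background):
    """Absorbance attributable to growth: later reading minus the time-zero reading."""
    return reading - background


def growth_inhibition_percent(treated_t0, treated_later, control_t0, control_later):
    """Inhibition (%) of a treated well relative to the water control.

    Each well is corrected with its own time-zero (background) reading
    before the two wells are compared.
    """
    treated_net = net_absorbance(treated_later, treated_t0)
    control_net = net_absorbance(control_later, control_t0)
    if control_net <= 0:
        raise ValueError("control well shows no growth; inhibition is undefined")
    return 100.0 * (1.0 - treated_net / control_net)


# Hypothetical A600 readings for one treated well and one control well.
inhibition = growth_inhibition_percent(
    treated_t0=0.05, treated_later=0.23,
    control_t0=0.05, control_later=0.55,
)
```

Correcting each well with its own time-zero reading removes any turbidity contributed by the suspended wettable powder itself, which would otherwise be mistaken for growth.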

K. Expression of Antibiotic Biosynthetic Genes in Transgenic Plants

Example 34

Modification of Coding Sequences and Adjacent Sequences

The cloned APS biosynthetic genes described in this application can be modified for expression in transgenic plant hosts. This is done with the aim of producing extractable quantities of APS from transgenic plants (i.e. for similar reasons to those described in Section E above), or alternatively the aim of such expression can be the accumulation of APS in plant tissue for the provision of pathogen protection on host plants. A host plant expressing genes for the biosynthesis of an APS and which produces the APS in its cells will have enhanced resistance to phytopathogen attack and will thus be better equipped to withstand crop losses associated with such attack.

The transgenic expression in plants of genes derived from microbial sources may require the modification of those genes to achieve and optimize their expression in plants. In particular, bacterial ORFs which encode separate enzymes but which are encoded by the same transcript in the native microbe are best expressed in plants on separate transcripts. To achieve this, each microbial ORF is isolated individually and cloned within a cassette which provides a plant promoter sequence at the 5' end of the ORF and a plant transcriptional terminator at the 3' end of the ORF. The isolated ORF sequence preferably includes the initiating ATG codon and the terminating STOP codon but may include additional sequence beyond the initiating ATG and the STOP codon. In addition, the ORF may be truncated, but still retain the required activity; for particularly long ORFs, truncated versions which retain activity may be preferable for expression in transgenic organisms. By "plant promoter" and "plant transcriptional terminator" it is intended to mean promoters and transcriptional terminators which operate within plant cells. This includes promoters and transcription terminators which may be derived from non-plant sources such as viruses (an example is the Cauliflower Mosaic Virus).

In some cases, modification of the ORF coding sequences and adjacent sequence will not be required. It is sufficient to isolate a fragment containing the ORF of interest and to insert it downstream of a plant promoter. For example, Gaffney et al. (Science 261: 754-756 (1993)) have expressed the Pseudomonas nahG gene in transgenic plants under the control of the CaMV 35S promoter and the CaMV tml terminator successfully without modification of the coding sequence and with x bp of the Pseudomonas gene upstream of the ATG still attached, and y bp downstream of the STOP codon still attached to the nahG ORF. Preferably, as little adjacent microbial sequence as possible should be left attached upstream of the ATG and downstream of the STOP codon. In practice, such construction may depend on the availability of restriction sites.

In other cases, genes derived from microbial sources may present problems in expression. These problems have been well characterized in the art and are particularly common with genes derived from certain sources such as Bacillus. These problems may apply to the APS biosynthetic genes of this invention and the modification of these genes can be undertaken using techniques now well known in the art. The following problems may be encountered:

(1) Codon Usage. The preferred codon usage in plants differs from the preferred codon usage in certain microorganisms. Comparison of the usage of codons within a cloned microbial ORF to usage in plant genes (and in particular genes from the target plant) will enable an identification of the codons within the ORF which should preferably be changed. Typically plant evolution has tended towards a strong preference for the nucleotides C and G in the third base position in monocotyledons, whereas dicotyledons often use the nucleotides A or T at this position. By modifying a gene to incorporate preferred codon usage for a particular target transgenic species, many of the problems described below for GC/AT content and illegitimate splicing will be overcome.

(2) GC/AT Content. Plant genes typically have a GC content of more than 35%. ORF sequences which are rich in A and T nucleotides can cause several problems in plants. Firstly, motifs of ATTTA are believed to cause destabilization of messages and are found at the 3' end of many short-lived mRNAs. Secondly, the occurrence of polyadenylation signals such as AATAAA at inappropriate positions within the message is believed to cause premature truncation of transcription. In addition, monocotyledons may recognize AT-rich sequences as splice sites (see below).

(3) Sequences Adjacent to the Initiating Methionine. Plants differ from microorganisms in that their messages do not possess a defined ribosome binding site. Rather, it is believed that ribosomes attach to the 5' end of the message and scan for the first available ATG at which to start translation. Nevertheless, it is believed that there is a preference for certain nucleotides adjacent to the ATG and that expression of microbial genes can be enhanced by the inclusion of a eukaryotic consensus translation initiator at the ATG. Clontech (1993/1994 catalog, page 210) have suggested the sequence GTCGACCATGGTC (SEQ ID NO:7) as a consensus translation initiator for the expression of the E. coli uidA gene in plants. Further, Joshi (Nucl Acids Res 15: 6643-6653 (1987)) has compared many plant sequences adjacent to the ATG and suggests the consensus TAAACAATGGCT (SEQ ID NO:8). In situations where difficulties are encountered in the expression of microbial ORFs in plants, inclusion of one of these sequences at the initiating ATG may improve translation. In such cases the last three nucleotides of the consensus may not be appropriate for inclusion in the modified sequence because they would alter the second amino acid residue. Preferred sequences adjacent to the initiating methionine may differ between different plant species. A survey of 14 maize genes located in the GenBank database provided the following results:

    ______________________________________________________
    Position Before the Initiating ATG in 14 Maize Genes:
          -10   -9   -8   -7   -6   -5   -4   -3   -2   -1
    ______________________________________________________
    C       3    8    4    6    2    5    6    0   10    7
    T       3    0    3    4    3    2    1    1    1    0
    A       2    3    1    4    3    2    3    7    2    3
    G       6    3    6    0    6    5    4    6    1    5
    ______________________________________________________

This analysis can be done for the desired plant species into which APS genes are being incorporated, and the sequence adjacent to the ATG modified to incorporate the preferred nucleotides.
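As a minimal sketch of this kind of analysis, the most frequent nucleotide at each position can be read from a count table such as the maize table above. The counts are transcribed from that table; the function name and the convention of reporting ties as "C/G" are assumptions of this sketch, and a survey of a different species would simply substitute its own counts.

```python
# Nucleotide counts at positions -10 .. -1 before the initiating ATG,
# transcribed from the survey of 14 maize genes.
MAIZE_COUNTS = {
    "C": [3, 8, 4, 6, 2, 5, 6, 0, 10, 7],
    "T": [3, 0, 3, 4, 3, 2, 1, 1, 1, 0],
    "A": [2, 3, 1, 4, 3, 2, 3, 7, 2, 3],
    "G": [6, 3, 6, 0, 6, 5, 4, 6, 1, 5],
}
POSITIONS = list(range(-10, 0))


def preferred_nucleotides(counts):
    """Return the most frequent nucleotide at each position.

    Ties are reported as a slash-separated, alphabetically sorted string.
    """
    n_positions = len(next(iter(counts.values())))
    preferred = []
    for i in range(n_positions):
        best = max(row[i] for row in counts.values())
        tied = sorted(base for base, row in counts.items() if row[i] == best)
        preferred.append("/".join(tied))
    return preferred


for position, base in zip(POSITIONS, preferred_nucleotides(MAIZE_COUNTS)):
    print(position, base)
```

With only 14 genes surveyed, several positions show weak preferences, so the output is better read as a guide to which positions matter (for example -3 and -2) than as a fixed consensus.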

(4) Removal of Illegitimate Splice Sites. Genes cloned from non-plant sources and not optimized for expression in plants may also contain motifs which may be recognized in plants as 5' or 3' splice sites, and be cleaved, thus generating truncated or deleted messages. These sites can be removed using the techniques described in pending application Ser. No. 07/961,944, hereby incorporated by reference.
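The sequence features discussed in items (1) and (2) above lend themselves to a simple preliminary screen of a microbial ORF, sketched below. The function names and the toy sequence are hypothetical, and the screen only reports overall GC content, GC at third codon positions, and the locations of ATTTA and AATAAA motifs; it does not attempt to predict splice sites or to redesign the gene.

```python
def gc_content(seq):
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    return sum(seq.count(base) for base in "GC") / len(seq)


def third_position_gc(orf):
    """Fraction of G or C at the third base of each codon (ORF must be in frame)."""
    thirds = orf.upper()[2::3]
    return sum(thirds.count(base) for base in "GC") / len(thirds)


def find_motif(seq, motif):
    """0-based start positions of every (possibly overlapping) motif occurrence."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def screen_orf(orf):
    """Collect the features that may warrant modification before plant expression."""
    return {
        "gc_content": gc_content(orf),
        "third_position_gc": third_position_gc(orf),
        "ATTTA": find_motif(orf, "ATTTA"),
        "AATAAA": find_motif(orf, "AATAAA"),
    }


# Toy sequence for illustration only; not taken from any APS gene.
report = screen_orf("ATGGCATTTAGCAATAAAGGC")
```

A GC content near or below the 35% figure cited above, or any hit for either motif, would flag the ORF as a candidate for the synthetic-gene approaches referred to in this example.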

Techniques for the modification of coding sequences and adjacent sequences are well known in the art. In cases where the initial expression of a microbial ORF is low and it is deemed appropriate to make alterations to the sequence as described above, then the construction of synthetic genes can be accomplished according to methods well known in the art. These are, for example, described in the published patent disclosures EP 0 385 962 (to Monsanto), EP 0 359 472 (to Lubrizol) and WO 93/07278 (to Ciba-Geigy). In most cases it is preferable to assay the expression of gene constructions using transient assay protocols (which are well known in the art) prior to their transfer to transgenic plants.

Example 35

Construction of Plant Transformation Vectors

Numerous transformation vectors are available for plant transformation, and the genes of this invention can be used in conjunction with any such vectors. The selection of vector for use will depend upon the preferred transformation technique and the target species for transformation. For certain target species, different antibiotic or herbicide selection markers may be preferred. Selection markers used routinely in transformation include the nptII gene which confers resistance to kanamycin and related antibiotics (Messing & Vierra, Gene 19: 259-268 (1982); Bevan et al., Nature 304: 184-187 (1983)), the bar gene which confers resistance to the herbicide phosphinothricin (White et al., Nucl Acids Res 18: 1062 (1990), Spencer et al. Theor Appl Genet 79: 625-631 (1990)), the hph gene which confers resistance to the antibiotic hygromycin (Blochinger & Diggelmann, Mol Cell Biol 4: 2929-2931), and the dhfr gene, which confers resistance to methotrexate (Bourouis et al., EMBO J. 2(7): 1099-1104 (1983)).

(1) Construction of Vectors Suitable for Agrobacterium Transformation

Many vectors are available for transformation using Agrobacterium tumefaciens. These typically carry at least one T-DNA border sequence and include vectors such as pBIN19 (Bevan, Nucl. Acids Res. (1984)) and pXYZ. Below the construction of two typical vectors is described.

Construction of pCIB200 and pCIB2001

The binary vectors pCIB200 and pCIB2001 are used for the construction of recombinant vectors for use with Agrobacterium and were constructed in the following manner. pTJS75kan was created by NarI digestion of pTJS75 (Schmidhauser & Helinski, J Bacteriol. 164: 446-455 (1985)) allowing excision of the tetracycline-resistance gene, followed by insertion of an AccI fragment from pUC4K carrying an NPTII gene (Messing & Vierra, Gene 19: 259-268 (1982); Bevan et al., Nature 304: 184-187 (1983); McBride et al., Plant Molecular Biology 14: 266-276 (1990)). XhoI linkers were ligated to the EcoRV fragment of pCIB7 which contains the left and right T-DNA borders, a plant selectable nos/nptII chimeric gene and the pUC polylinker (Rothstein et al., Gene 53: 153-161 (1987)), and the XhoI-digested fragment was cloned into SalI-digested pTJS75kan to create pCIB200 (see also EP 0 332 104, example 19). pCIB200 contains the following unique polylinker restriction sites: EcoRI, SstI, KpnI, BglII, XbaI, and SalI. pCIB2001 is a derivative of pCIB200 which was created by the insertion into the polylinker of additional restriction sites. Unique restriction sites in the polylinker of pCIB2001 are EcoRI, SstI, KpnI, BglII, XbaI, SalI, MluI, BclI, AvrII, ApaI, HpaI, and StuI. pCIB2001, in addition to containing these unique restriction sites, also has plant and bacterial kanamycin selection, left and right T-DNA borders for Agrobacterium-mediated transformation, the RK2-derived trfA function for mobilization between E. coli and other hosts, and the OriT and OriV functions also from RK2. The pCIB2001 polylinker is suitable for the cloning of plant expression cassettes containing their own regulatory signals.

Construction of pCIB10 and Hygromycin Selection Derivatives thereof

The binary vector pCIB10 contains a gene encoding kanamycin resistance for selection in plants, T-DNA right and left border sequences and incorporates sequences from the wide host-range plasmid pRK252 allowing it to replicate in both E. coli and Agrobacterium. Its construction is described by Rothstein et al. (Gene 53: 153-161 (1987)). Various derivatives of pCIB10 have been constructed which incorporate the gene for hygromycin B phosphotransferase described by Gritz et al. (Gene 25: 179-188 (1983)). These derivatives enable selection of transgenic plant cells on hygromycin only (pCIB743), or hygromycin and kanamycin (pCIB715, pCIB717).

(2) Construction of Vectors Suitable for non-Agrobacterium Transformation

Transformation without the use of Agrobacterium tumefaciens circumvents the requirement for T-DNA sequences in the chosen transformation vector and consequently vectors lacking these sequences can be utilized in addition to vectors such as the ones described above which contain T-DNA sequences. Transformation techniques which do not rely on Agrobacterium include transformation via particle bombardment, protoplast uptake (e.g. PEG and electroporation) and microinjection. The choice of vector depends largely on the preferred selection for the species being transformed. Below, the construction of some typical vectors is described.

Construction of pCIB3064

pCIB3064 is a pUC-derived vector suitable for direct gene transfer techniques in combination with selection by the herbicide basta (or phosphinothricin). The plasmid pCIB246 comprises the CaMV 35S promoter in operational fusion to the E. coli GUS gene and the CaMV 35S transcriptional terminator and is described in the PCT published application WO 93/07278. The 35S promoter of this vector contains two ATG sequences 5' of the start site. These sites were mutated using standard PCR techniques in such a way as to remove the ATGs and generate the restriction sites SspI and PvuII. The new restriction sites were 96 and 37 bp away from the unique SalI site and 101 and 42 bp away from the actual start site. The resultant derivative of pCIB246 was designated pCIB3025. The GUS gene was then excised from pCIB3025 by digestion with SalI and SacI, the termini rendered blunt and religated to generate plasmid pCIB3060. The plasmid pJIT82 was obtained from the John Innes Centre, Norwich and a 400 bp SmaI fragment containing the bar gene from Streptomyces viridochromogenes was excised and inserted into the HpaI site of pCIB3060 (Thompson et al. EMBO J 6: 2519-2523 (1987)).

This generated pCIB3064 which comprises the bar gene under the control of the CaMV 35S promoter and terminator for herbicide selection, a gene for ampicillin resistance (for selection in E. coli) and a polylinker with the unique sites SphI, PstI, HindIII, and BamHI. This vector is suitable for the cloning of plant expression cassettes containing their own regulatory signals.

Construction of pSOG19 and pSOG35

pSOG35 is a transformation vector which utilizes the E. coli dihydrofolate reductase (DHFR) gene as a selectable marker conferring resistance to methotrexate. PCR was used to amplify the 35S promoter (~800 bp), intron 6 from the maize Adh1 gene (~550 bp) and 18 bp of the GUS untranslated leader sequence from pSOG10. A 250 bp fragment encoding the E. coli dihydrofolate reductase type II gene was also amplified by PCR and these two PCR fragments were assembled with a SacI-PstI fragment from pBI221 (Clontech) which comprised the pUC19 vector backbone and the nopaline synthase terminator. Assembly of these fragments generated pSOG19 which contains the 35S promoter in fusion with the intron 6 sequence, the GUS leader, the DHFR gene and the nopaline synthase terminator. Replacement of the GUS leader in pSOG19 with the leader sequence from Maize Chlorotic Mottle Virus (MCMV) generated the vector pSOG35. pSOG19 and pSOG35 carry the pUC gene for ampicillin resistance and have HindIII, SphI, PstI and EcoRI sites available for the cloning of foreign sequences.

Example 36

Requirements for Construction of Plant Expression Cassettes

Gene sequences intended for expression in transgenic plants are first assembled in expression cassettes behind a suitable promoter and upstream of a suitable transcription terminator. These expression cassettes can then be easily transferred to the plant transformation vectors described above in example 35.

Promoter Selection

The selection of promoter used in expression cassettes will determine the spatial and temporal expression pattern of the transgene in the transgenic plant. Selected promoters will express transgenes in specific cell types (such as leaf epidermal cells, mesophyll cells, root cortex cells) or in specific tissues or organs (roots, leaves or flowers, for example) and this selection will reflect the desired location of biosynthesis of the APS. Alternatively, the selected promoter may be light-induced or otherwise temporally regulated. A further alternative is that the selected promoter be chemically regulated. This would provide the possibility of inducing production of the APS only when desired, by treatment with a chemical inducer.

Transcriptional Terminators

A variety of transcriptional terminators are available for use in expression cassettes. These are responsible for the termination of transcription beyond the transgene and its correct polyadenylation. Appropriate transcriptional terminators are those which are known to function in plants and include the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator, and the pea rbcS E9 terminator. These can be used in both monocotyledons and dicotyledons.

Sequences for the Enhancement or Regulation of Expression

Numerous sequences have been found to enhance gene expression from within the transcriptional unit and these sequences can be used in conjunction with the genes of this invention to increase their expression in transgenic plants.

Various intron sequences have been shown to enhance expression, particularly in monocotyledonous cells. For example, the introns of the maize Adh1 gene have been found to significantly enhance the expression of the wild-type gene under its cognate promoter when introduced into maize cells. Intron 1 was found to be particularly effective and enhanced expression in fusion constructs with the chloramphenicol acetyltransferase gene (Callis et al., Genes Develop 1: 1183-1200 (1987)). In the same experimental system, the intron from the maize bronze1 gene had a similar effect in enhancing expression (Callis et al., supra). Intron sequences have been routinely incorporated into plant transformation vectors, typically within the non-translated leader.

A number of non-translated leader sequences derived from viruses are also known to enhance expression, and these are particularly effective in dicotyledonous cells. Specifically, leader sequences from Tobacco Mosaic Virus (TMV, the "Ω-sequence"), Maize Chlorotic Mottle Virus (MCMV), and Alfalfa Mosaic Virus (AMV) have been shown to be effective in enhancing expression (e.g. Gallie et al. Nucl. Acids Res. 15: 8693-8711 (1987); Skuzeski et al. Plant Molec. Biol. 15: 65-79 (1990)).

Targeting of the Gene Product Within the Cell

Various mechanisms for targeting gene products are known to exist in plants and the sequences controlling the functioning of these mechanisms have been characterized in some detail. For example, the targeting of gene products to the chloroplast is controlled by a signal sequence found at the aminoterminal end of various proteins and which is cleaved during chloroplast import yielding the mature protein (e.g. Comai et al. J. Biol. Chem. 263: 15104-15109 (1988)). These signal sequences can be fused to heterologous gene products to effect the import of heterologous products into the chloroplast (van den Broeck et al. Nature 313: 358-363 (1985)). DNA encoding appropriate signal sequences can be isolated from the 5' end of the cDNAs encoding the RUBISCO protein, the CAB protein, the EPSP synthase enzyme, the GS2 protein and many other proteins which are known to be chloroplast localized.

Other gene products are localized to other organelles such as the mitochondrion and the peroxisome (e.g. Unger et al. Plant Molec. Biol. 13: 411-418 (1989)). The cDNAs encoding these products can also be manipulated to effect the targeting of heterologous gene products to these organelles. Examples of such sequences are the nuclear-encoded ATPases and specific aspartate amino transferase isoforms for mitochondria. Targeting to cellular protein bodies has been described by Rogers et al. (Proc. Natl. Acad. Sci. USA 82: 6512-6516 (1985)).

In addition, sequences have been characterized which cause the targeting of gene products to other cell compartments. Aminoterminal sequences are responsible for targeting to the ER, the apoplast, and extracellular secretion from aleurone cells (Koehler & Ho, Plant Cell 2: 769-783 (1990)). Additionally, aminoterminal sequences in conjunction with carboxyterminal sequences are responsible for vacuolar targeting of gene products (Shinshi et al. Plant Molec. Biol. 14: 357-368 (1990)).

By the fusion of the appropriate targeting sequences described above to transgene sequences of interest it is possible to direct the transgene product to any organelle or cell compartment. For chloroplast targeting, for example, the chloroplast signal sequence from the RUBISCO gene, the CAB gene, the EPSP synthase gene, or the GS2 gene is fused in frame to the aminoterminal ATG of the transgene. The signal sequence selected should include the known cleavage site and the fusion constructed should take into account any amino acids after the cleavage site which are required for cleavage. In some cases this requirement may be fulfilled by the addition of a small number of amino acids between the cleavage site and the transgene ATG or alternatively replacement of some amino acids within the transgene sequence. Fusions constructed for chloroplast import can be tested for efficacy of chloroplast uptake by in vitro translation of in vitro transcribed constructions followed by in vitro chloroplast uptake using techniques described by Bartlett et al. (In: Edelmann et al. (Eds.) Methods in Chloroplast Molecular Biology, Elsevier. pp 1081-1091 (1982)) and Wasmann et al. (Mol. Gen. Genet. 205: 446-453 (1986)). These construction techniques are well known in the art and are equally applicable to mitochondria and peroxisomes. The choice of targeting which may be required for APS biosynthetic genes will depend on the cellular localization of the precursor required as the starting point for a given pathway. This will usually be cytosolic or chloroplastic, although it may in some cases be mitochondrial or peroxisomal. The gene products of APS biosynthetic genes will not normally require targeting to the ER, the apoplast or the vacuole.

The above described mechanisms for cellular targeting can be utilized not only in conjunction with their cognate promoters, but also in conjunction with heterologous promoters so as to effect a specific cell targeting goal under the transcriptional regulation of a promoter which has an expression pattern different to that of the promoter from which the targeting signal derives.

Example 37

Examples of Expression Cassette Construction

The present invention encompasses the expression of genes encoding APSs under the regulation of any promoter which is expressible in plants, regardless of the origin of the promoter.

Furthermore, the invention encompasses the use of any plant-expressible promoter in conjunction with any further sequences required or selected for the expression of the APS gene. Such sequences include, but are not restricted to, transcriptional terminators, extraneous sequences to enhance expression (such as introns [e.g. Adh intron 1], viral sequences [e.g. TMV-Ω]), and sequences intended for the targeting of the gene product to specific organelles and cell compartments.

Constitutive Expression: the CaMV 35S Promoter

Construction of the plasmid pCGN1761 is described in the published patent application EP 0 392 225 (example 23). pCGN1761 contains the "double" 35S promoter and the tml transcriptional terminator with a unique EcoRI site between the promoter and the terminator and has a pUC-type backbone. A derivative of pCGN1761 was constructed which has a modified polylinker which includes NotI and XhoI sites in addition to the existing EcoRI site. This derivative was designated pCGN1761ENX. pCGN1761ENX is useful for the cloning of cDNA sequences or gene sequences (including microbial ORF sequences) within its polylinker for the purposes of their expression under the control of the 35S promoter in transgenic plants. The entire 35S promoter-gene sequence-tml terminator cassette of such a construction can be excised using the HindIII, SphI, SalI, and XbaI sites 5' to the promoter and the XbaI, BamHI and BglI sites 3' to the terminator for transfer to transformation vectors such as those described above in example 35. Furthermore, the double 35S promoter fragment can be removed by 5' excision with HindIII, SphI, SalI, XbaI, or PstI, and 3' excision with any of the polylinker restriction sites (EcoRI, NotI or XhoI) for replacement with another promoter.

Modification of pCGN1761ENX by Optimization of the Translational Initiation Site

For any of the constructions described in this section, modifications around the cloning sites can be made by the introduction of sequences which may enhance translation. This is particularly useful when genes derived from microorganisms are to be introduced into plant expression cassettes as these genes may not contain sequences adjacent to their initiating methionine which may be suitable for the initiation of translation in plants. In cases where genes derived from microorganisms are to be cloned into plant expression cassettes at their ATG it may be useful to modify the site of their insertion to optimize their expression. Modification of pCGN1761ENX is described by way of example to incorporate one of several optimized sequences for plant expression (e.g. Joshi, supra).

pCGN1761ENX is cleaved with SphI, treated with T4 DNA polymerase and religated, thus destroying the SphI site located 5' to the double 35S promoter. This generates vector pCGN1761ENX/Sph-. pCGN1761ENX/Sph- is cleaved with EcoRI, and ligated to an annealed molecular adaptor of the sequence 5'-AATTCTAAAGCATGCCGATCGG-3' (SEQ ID NO:9)/5'-AATTCCGATCGGCATGCTTTA-3' (SEQ ID NO:10). This generates the vector pCGNSENX which incorporates the quasi-optimized plant translational initiation sequence TAAA-C adjacent to the ATG which is itself part of an SphI site which is suitable for cloning heterologous genes at their initiating methionine. Downstream of the SphI site, the EcoRI, NotI, and XhoI sites are retained.
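
The adaptor described above can be checked on paper before ordering oligonucleotides. The following is a minimal sketch, not part of the described procedure: it takes SEQ ID NO:9 and SEQ ID NO:10 exactly as printed, finds the region over which the two strands can anneal, and confirms that this duplex carries the SphI recognition sequence (GCATGC). The helper names are illustrative assumptions.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def longest_common_substring(a: str, b: str) -> str:
    """Return the longest stretch of `a` that also occurs in `b`."""
    best = ""
    for i in range(len(a)):
        # Only try candidates longer than the best one found so far.
        for j in range(i + len(best) + 1, len(a) + 1):
            if a[i:j] in b:
                best = a[i:j]
            else:
                break
    return best


UPPER = "AATTCTAAAGCATGCCGATCGG"  # SEQ ID NO:9 as printed
LOWER = "AATTCCGATCGGCATGCTTTA"   # SEQ ID NO:10 as printed
SPHI_SITE = "GCATGC"              # SphI recognition sequence

# The region of the upper strand that base-pairs with the lower strand.
duplex = longest_common_substring(UPPER, reverse_complement(LOWER))

print("annealed region (upper strand):", duplex)
print("contains SphI site:", SPHI_SITE in duplex)
print("both 5' ends start with AATT:",
      UPPER.startswith("AATT") and LOWER.startswith("AATT"))
```

Both strands begin with AATT, the single-stranded sequence left by EcoRI cleavage, which is what allows the annealed adaptor to be ligated into the EcoRI-cut vector.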

An alternative vector is constructed which utilizes an NcoI site at the initiating ATG. This vector, designated pCGN1761NENX, is made by inserting an annealed molecular adaptor of the sequence 5'-AATTCTAAACCATGGATCGG-3' (SEQ ID NO:11)/5'-AATTCCGATCGCCATGGTTTA-3' (SEQ ID NO:12) at the pCGN1761ENX EcoRI site (Sequence ID's 14 & 15). Thus, the vector includes the quasi-optimized sequence TAAACC adjacent to the initiating ATG which is within the NcoI site. Downstream sites are EcoRI, NotI, and XhoI. Prior to this manipulation, however, the two NcoI sites in the pCGN1761ENX vector (at upstream positions of the 5' 35S promoter unit) are destroyed using similar techniques to those described above for SphI or alternatively using "inside-outside" PCR (Innes et al. PCR Protocols: A guide to methods and applications. Academic Press, New York (1990); see Example 41). This manipulation can be assayed for any possible detrimental effect on expression by insertion of any plant cDNA or reporter gene sequence into the cloning site followed by routine expression analysis in plants.

Expression under a Chemically Regulatable Promoter

This section describes the replacement of the double 35S promoter in pCGN1761ENX with any promoter of choice; by way of example the chemically regulated PR-1a promoter is described. The promoter of choice is preferably excised from its source by restriction enzymes, but can alternatively be PCR-amplified using primers which carry appropriate terminal restriction sites. Should PCR-amplification be undertaken, then the promoter should be resequenced to check for amplification errors after the cloning of the amplified promoter in the target vector. The chemically regulatable tobacco PR-1a promoter is cleaved from plasmid pCIB1004 (see EP 0 332 104, example 21 for construction) and transferred to plasmid pCGN1761ENX. pCIB1004 is cleaved with NcoI and the resultant 3' overhang of the linearized fragment is rendered blunt by treatment with T4 DNA polymerase. The fragment is then cleaved with HindIII and the resultant PR-1a promoter containing fragment is gel purified and cloned into pCGN1761ENX from which the double 35S promoter has been removed. This is done by cleavage with XhoI and blunting with T4 polymerase, followed by cleavage with HindIII and isolation of the larger vector-terminator containing fragment into which the pCIB1004 promoter fragment is cloned. This generates a pCGN1761ENX derivative with the PR-1a promoter and the tml terminator and an intervening polylinker with unique EcoRI and NotI sites. Selected APS genes can be inserted into this vector, and the fusion products (i.e. promoter-gene-terminator) can subsequently be transferred to any selected transformation vector, including those described in this application.

Constitutive Expression: the Actin Promoter

Several isoforms of actin are known to be expressed in most cell types and consequently the actin promoter is a good choice for a constitutive promoter. In particular, the promoter from the rice Act1 gene has been cloned and characterized (McElroy et al. Plant Cell 2: 163-171 (1990)). A 1.3 kb fragment of the promoter was found to contain all the regulatory elements required for expression in rice protoplasts. Furthermore, numerous expression vectors based on the Act1 promoter have been constructed specifically for use in monocotyledons (McElroy et al. Mol. Gen. Genet. 231: 150-160 (1991)). These incorporate the Act1-intron 1, Adh1 5' flanking sequence and Adh1-intron 1 (from the maize alcohol dehydrogenase gene) and sequence from the CaMV 35S promoter. Vectors showing highest expression were fusions of 35S and the Act1 intron or the Act1 5' flanking sequence and the Act1 intron. Optimization of sequences around the initiating ATG (of the GUS reporter gene) also enhanced expression. The promoter expression cassettes described by McElroy et al. (Mol. Gen. Genet. 231: 150-160 (1991)) can be easily modified for the expression of APS biosynthetic genes and are particularly suitable for use in monocotyledonous hosts. For example, promoter containing fragments can be removed from the McElroy constructions and used to replace the double 35S promoter in pCGN1761ENX, which is then available for the insertion of specific gene sequences. The fusion genes thus constructed can then be transferred to appropriate transformation vectors. In a separate report the rice Act1 promoter with its first intron has also been found to direct high expression in cultured barley cells (Chibbar et al. Plant Cell Rep. 12: 506-509 (1993)).

Constitutive Expression: the Ubiquitin Promoter

Ubiquitin is another gene product known to accumulate in many cell types and its promoter has been cloned from several species for use in transgenic plants (e.g. sunflower--Binet et al. Plant Science 79: 87-94 (1991), maize--Christensen et al. Plant Molec. Biol. 12: 619-632 (1989)). The maize ubiquitin promoter has been developed in transgenic monocot systems and its sequence and vectors constructed for monocot transformation are disclosed in the patent publication EP 0 342 926 (to Lubrizol). Further, Taylor et al. (Plant Cell Rep. 12: 491-495 (1993)) describe a vector (pAHC25) which comprises the maize ubiquitin promoter and first intron and its high activity in cell suspensions of numerous monocotyledons when introduced via microprojectile bombardment. The ubiquitin promoter is clearly suitable for the expression of APS biosynthetic genes in transgenic plants, especially monocotyledons. Suitable vectors are derivatives of pAHC25 or any of the transformation vectors described in this application, modified by the introduction of the appropriate ubiquitin promoter and/or intron sequences.

Root Specific Expression

A preferred pattern of expression for the APSs of the instant invention is root expression. Root expression is particularly useful for the control of soil-borne phytopathogens such as Rhizoctonia and Pythium. Expression of APSs only in root tissue would have the advantage of controlling root invading phytopathogens, without a concomitant accumulation of APS in leaf and flower tissue and seeds. A suitable root promoter is that described by de Framond (FEBS 290: 103-106 (1991)) and also in the published patent application EP 0 452 269 (to Ciba-Geigy). This promoter is transferred to a suitable vector such as pCGN1761ENX for the insertion of an APS gene of interest and subsequent transfer of the entire promoter-gene-terminator cassette to a transformation vector of interest.

Wound Inducible Promoters

Wound-inducible promoters are particularly suitable for the expression of APS biosynthetic genes because they are typically active not just on wound induction, but also at the sites of phytopathogen infection. Numerous such promoters have been described (e.g. Xu et al. Plant Molec. Biol. 22: 573-588 (1993), Logemann et al. Plant Cell 1: 151-158 (1989), Rohrmeier & Lehle, Plant Molec. Biol. 22: 783-792 (1993), Firek et al. Plant Molec. Biol. 22: 129-142 (1993), Warner et al. Plant J. 3: 191-201 (1993)) and all are suitable for use with the instant invention. Logemann et al. (supra) describe the 5' upstream sequences of the dicotyledonous potato wun1 gene. Xu et al. (supra) show that a wound inducible promoter from the dicotyledon potato (pin2) is active in the monocotyledon rice. Further, Rohrmeier & Lehle (supra) describe the cloning of the maize Wip1 cDNA which is wound induced and which can be used to isolate the cognate promoter using standard techniques. Similarly, Firek et al. (supra) and Warner et al. (supra) have described a wound induced gene from the monocotyledon Asparagus officinalis which is expressed at local wound and pathogen invasion sites. Using cloning techniques well known in the art, these promoters can be transferred to suitable vectors, fused to the APS biosynthetic genes of this invention, and used to express these genes at the sites of phytopathogen infection.

Pith Preferred Expression

Patent Application WO 93/07278 (to Ciba-Geigy) describes the isolation of the maize trpA gene which is preferentially expressed in pith cells. The gene sequence and promoter extending up to -1726 from the start of transcription are presented. Using standard molecular biological techniques, this promoter or parts thereof can be transferred to a vector such as pCGN1761 where it can replace the 35S promoter and be used to drive the expression of a foreign gene in a pith-preferred manner. In fact fragments containing the pith-preferred promoter or parts thereof can be transferred to any vector and modified for utility in transgenic plants.

Pollen-Specific Expression

Patent Application WO 93/07278 (to Ciba-Geigy) further describes the isolation of the maize calcium-dependent protein kinase (CDPK) gene which is expressed in pollen cells. The gene sequence and promoter extend up to 1400 bp from the start of transcription. Using standard molecular biological techniques, this promoter or parts thereof can be transferred to a vector such as pCGN1761 where it can replace the 35S promoter and be used to drive the expression of a foreign gene in a pollen-specific manner. In fact fragments containing the pollen-specific promoter or parts thereof can be transferred to any vector and modified for utility in transgenic plants.

Leaf-Specific Expression

A maize gene encoding phosphoenolpyruvate carboxylase (PEPC) has been described by Hudspeth & Grula (Plant Molec Biol 12: 579-589 (1989)). Using standard molecular biological techniques the promoter for this gene can be used to drive the expression of any gene in a leaf-specific manner in transgenic plants.

Expression with Chloroplast Targeting

Chen & Jagendorf (J. Biol. Chem. 268: 2363-2367 (1993)) have described the successful use of a chloroplast transit peptide for import of a heterologous transgene. The peptide used is the transit peptide from the rbcS gene from Nicotiana plumbaginifolia (Poulsen et al. Mol. Gen. Genet. 205: 193-200 (1986)). Using the restriction enzymes DraI and SphI, or Tsp509I and SphI, the DNA sequence encoding this transit peptide can be excised from plasmid prbcS-8B (Poulsen et al. supra) and manipulated for use with any of the constructions described above. The DraI-SphI fragment extends from -58 relative to the initiating rbcS ATG to, and including, the first amino acid (also a methionine) of the mature peptide immediately after the import cleavage site, whereas the Tsp509I-SphI fragment extends from -8 relative to the initiating rbcS ATG to, and including, the first amino acid of the mature peptide. Thus, these fragments can be appropriately inserted into the polylinker of any chosen expression cassette generating a transcriptional fusion to the untranslated leader of the chosen promoter (e.g. 35S, PR-1a, actin, ubiquitin etc.), whilst enabling the insertion of a required APS gene in correct fusion downstream of the transit peptide. Constructions of this kind are routine in the art. For example, whereas the DraI end is already blunt, the 5' Tsp509I site may be rendered blunt by T4 polymerase treatment, or may alternatively be ligated to a linker or adaptor sequence to facilitate its fusion to the chosen promoter. The 3' SphI site may be maintained as such, or may alternatively be ligated to adaptor or linker sequences to facilitate its insertion into the chosen vector in such a way as to make available appropriate restriction sites for the subsequent insertion of a selected APS gene. Ideally the ATG of the SphI site is maintained and comprises the first ATG of the selected APS gene.
Chen & Jagendorf (supra) provide consensus sequences for ideal cleavage for chloroplast import, and in each case a methionine is preferred at the first position of the mature protein. At subsequent positions there is more variation and the amino acid may not be so critical. In any case, fusion constructions can be assessed for efficiency of import in vitro using the methods described by Bartlett et al. (In: Edelmann et al. (Eds.) Methods in Chloroplast Molecular Biology, Elsevier. pp 1081-1091 (1982)) and Wasmann et al. (Mol. Gen. Genet. 205: 446-453 (1986)). Typically the best approach may be to generate fusions using the selected APS gene with no modifications at the aminoterminus, and only to incorporate modifications when it is apparent that such fusions are not chloroplast imported at high efficiency, in which case modifications may be made in accordance with the established literature (Chen & Jagendorf, supra; Wasmann et al., supra; Ko & Ko, J. Biol. Chem. 267: 13910-13916 (1992)).

A preferred vector is constructed by transferring the DraI-SphI transit peptide encoding fragment from prbcS-8B to the cloning vector pCGN1761ENX/Sph-. This plasmid is cleaved with EcoRI and the termini rendered blunt by treatment with T4 DNA polymerase. Plasmid prbcS-8B is cleaved with SphI and ligated to an annealed molecular adaptor of the sequence 5'-CCAGCTGGAATTCCG-3' (SEQ ID NO:13)/5'-CGGAATTCCAGCTGGCATG-3' (SEQ ID NO:14). The resultant product is 5'-terminally phosphorylated by treatment with T4 kinase. Subsequent cleavage with DraI releases the transit peptide encoding fragment which is ligated into the blunt-end ex-EcoRI sites of the modified vector described above. Clones oriented with the 5' end of the insert adjacent to the 3' end of the 35S promoter are identified by sequencing. These clones carry a DNA fusion of the 35S leader sequence to the rbcS-8A promoter-transit peptide sequence extending from -58 relative to the rbcS ATG to the ATG of the mature protein, and including at that position a unique SphI site, and a newly created EcoRI site, as well as the existing NotI and XhoI sites of pCGN1761ENX. This new vector is designated pCGN1761/CT. DNA sequences are transferred to pCGN1761/CT in frame by amplification using PCR techniques and incorporation of an SphI, NsphI, or NlaIII site at the amplified ATG, which following restriction enzyme cleavage with the appropriate enzyme is ligated into SphI-cleaved pCGN1761/CT. To facilitate construction, it may be required to change the second amino acid of the cloned gene; however, in almost all cases the use of PCR together with standard site directed mutagenesis will enable the construction of any desired sequence around the cleavage site and first methionine of the mature protein.

A further preferred vector is constructed by replacing the double 35S promoter of pCGN1761ENX with the BamHI-SphI fragment of prbcS-8A which contains the full-length light regulated rbcS-8A promoter from -1038 (relative to the transcriptional start site) up to the first methionine of the mature protein. The modified pCGN1761 with the destroyed SphI site is cleaved with PstI and EcoRI and treated with T4 DNA polymerase to render termini blunt. prbcS-8A is cleaved with SphI and ligated to the annealed molecular adaptor of the sequence described above. The resultant product is 5'-terminally phosphorylated by treatment with T4 kinase. Subsequent cleavage with BamHI releases the promoter-transit peptide containing fragment which is treated with T4 DNA polymerase to render the BamHI terminus blunt. The promoter-transit peptide fragment thus generated is cloned into the prepared pCGN1761ENX vector, generating a construction comprising the rbcS-8A promoter and transit peptide with an SphI site located at the cleavage site for insertion of heterologous genes. Further, downstream of the SphI site there are EcoRI (re-created), NotI, and XhoI cloning sites. This construction is designated pCGN1761rbcS/CT.

Similar manipulations can be undertaken to utilize other GS2 chloroplast transit peptide encoding sequences from other sources (monocotyledonous and dicotyledonous) and from other genes. In addition, similar procedures can be followed to achieve targeting to other subcellular compartments such as mitochondria.

Example 38

Techniques for the Isolation of New Promoters Suitable for the Expression of APS Genes

New promoters are isolated using standard molecular biological techniques including any of the techniques described below. Once isolated, they are fused to reporter genes such as GUS or LUC and their expression pattern in transgenic plants analyzed (Jefferson et al. EMBO J. 6: 3901-3907 (1987); Ow et al. Science 234: 856-859 (1986)). Promoters which show the desired expression pattern are fused to APS genes for expression in planta.

Subtractive cDNA Cloning

Subtractive cDNA cloning techniques are useful for the generation of cDNA libraries enriched for a particular population of mRNAs (e.g. Hara et al. Nucl. Acids Res. 19: 7097-7104 (1991)).

Recently, techniques have been described which allow the construction of subtractive libraries from small amounts of tissue (Sharma et al. Biotechniques 15: 610-612 (1993)). These techniques are suitable for the enrichment of messages specific for tissues which may be available only in small amounts such as the tissue immediately adjacent to wound or pathogen infection sites.

Differential Screening by Standard Plus/Minus Techniques

λ phage carrying cDNAs derived from different RNA populations (viz. root versus whole plant, stem specific versus whole plant, local pathogen infection points versus whole plant, etc.) are plated at low density and transferred to two sets of hybridization filters (for a review of differential screening techniques see Calvet, Pediatr. Nephrol. 5: 751-757 (1991)). cDNAs derived from the "choice" RNA population are hybridized to the first set and cDNAs from whole plant RNA are hybridized to the second set of filters. Plaques which hybridize to the first probe, but not to the second, are selected for further evaluation. They are picked and their cDNA used to screen Northern blots of "choice" RNA versus RNA from various other tissues and sources. Clones showing the required expression pattern are used to clone gene sequences from a genomic library to enable the isolation of the cognate promoter. Between 500 and 5000 bp of the cloned promoter is then fused to a reporter gene (e.g. GUS, LUC) and reintroduced into transgenic plants for expression analysis.
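
The selection rule in the plus/minus screen (keep plaques that hybridize to the "choice" probe but not to the whole-plant probe) amounts to a set difference. The sketch below only illustrates that bookkeeping; the plaque identifiers and the function name are hypothetical and do not come from the described procedure.

```python
def select_plus_minus(choice_positive, whole_plant_positive):
    """Return plaques that are positive on the first filter set only.

    choice_positive:      plaque IDs hybridizing to the "choice" cDNA probe
    whole_plant_positive: plaque IDs hybridizing to the whole-plant cDNA probe
    """
    return sorted(set(choice_positive) - set(whole_plant_positive))


# Hypothetical scoring of duplicate filters lifted from the same plate.
filter_set_1 = {"plaque_01", "plaque_02", "plaque_07", "plaque_11"}
filter_set_2 = {"plaque_02", "plaque_05", "plaque_11"}

candidates = select_plus_minus(filter_set_1, filter_set_2)
print(candidates)  # plaques to pick for Northern blot analysis
```

Plaques selected in this way are still only candidates; as the text notes, their expression pattern is confirmed on Northern blots before promoter isolation.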

Differential Screening by Differential Display

RNA is isolated from different sources, i.e. the choice source and whole plants as control, and subjected to the differential display technique of Liang and Pardee (Science 257: 967-971 (1992)). Amplified fragments which appear in the choice RNA, but not the control, are gel purified and used as probes on Northern blots carrying different RNA samples as described above. Fragments which hybridize selectively to the required RNA are cloned and used as probes to isolate the cDNA and also a genomic DNA fragment from which the promoter can be isolated. The isolated promoter is fused to the GUS or LUC reporter gene as described above to assess its expression pattern in transgenic plants.

Promoter Isolation Using "Promoter Trap" Technology

The insertion of promoterless reporter genes into transgenic plants can be used to identify sequences in a host plant which drive expression in desired cell types or with a desired strength. Variations of this technique are described by Ott & Chua (Mol. Gen. Genet. 223: 169-179 (1990)) and Kertbundit et al. (Proc. Natl. Acad. Sci. USA 88: 5212-5216 (1991)). In standard transgenic experiments the same principle can be extended to identify enhancer elements in the host genome where a particular transgene may be expressed at particularly high levels.

Example 39

Transformation of Dicotyledons

Transformation techniques for dicotyledons are well known in the art and include Agrobacterium-based techniques and techniques which do not require Agrobacterium. Non-Agrobacterium techniques involve the uptake of exogenous genetic material directly by protoplasts or cells. This can be accomplished by PEG or electroporation mediated uptake, particle bombardment-mediated delivery, or microinjection. Examples of these techniques are described by Paszkowski et al., EMBO J 3: 2717-2722 (1984), Potrykus et al., Mol. Gen. Genet. 199: 169-177 (1985), Reich et al., Biotechnology 4: 1001-1004 (1986), and Klein et al., Nature 327: 70-73 (1987). In each case the transformed cells are regenerated to whole plants using standard techniques known in the art.

Agrobacterium-mediated transformation is a preferred technique for transformation of dicotyledons because of its high efficiency of transformation and its broad utility with many different species. The many crop species which are routinely transformable by Agrobacterium include tobacco, tomato, sunflower, cotton, oilseed rape, potato, soybean, alfalfa and poplar (EP 0 317 511 (cotton [1313]), EP 0 249 432 (tomato, to Calgene), WO 87/07299 (Brassica, to Calgene), U.S. Pat. No. 4,795,855 (poplar)). Agrobacterium transformation typically involves the transfer of the binary vector carrying the foreign DNA of interest (e.g. pCIB200 or pCIB2001) to an appropriate Agrobacterium strain which may depend on the complement of vir genes carried by the host Agrobacterium strain either on a co-resident Ti plasmid or chromosomally (e.g. strain CIB542 for pCIB200 and pCIB2001 (Uknes et al. Plant Cell 5: 159-169 (1993))). The transfer of the recombinant binary vector to Agrobacterium is accomplished by a triparental mating procedure using E. coli carrying the recombinant binary vector, a helper E. coli strain which carries a plasmid such as pRK2013 and which is able to mobilize the recombinant binary vector to the target Agrobacterium strain. Alternatively, the recombinant binary vector can be transferred to Agrobacterium by DNA transformation (Hofgen & Willmitzer, Nucl. Acids Res. 16: 9877 (1988)).

Transformation of the target plant species by recombinant Agrobacterium usually involves co-cultivation of the Agrobacterium with explants from the plant and follows protocols well known in the art. Transformed tissue is regenerated on selectable medium carrying the antibiotic or herbicide resistance marker present between the binary plasmid T-DNA borders.

Example 40

Transformation of Monocotyledons

Transformation of most monocotyledon species has now also become routine. Preferred techniques include direct gene transfer into protoplasts using PEG or electroporation techniques, and particle bombardment into callus tissue. Transformations can be undertaken with a single DNA species or multiple DNA species (i.e. co-transformation) and both these techniques are suitable for use with this invention. Co-transformation may have the advantage of avoiding complex vector construction and of generating transgenic plants with unlinked loci for the gene of interest and the selectable marker, enabling the removal of the selectable marker in subsequent generations, should this be regarded desirable. However, a disadvantage of the use of co-transformation is the less than 100% frequency with which separate DNA species are integrated into the genome (Schocher et al. Biotechnology 4: 1093-1096 (1986)).

Patent Applications EP 0 292 435 ([1280/1281] to Ciba-Geigy), EP 0 392 225 (to Ciba-Geigy) and WO 93/07278 (to Ciba-Geigy) describe techniques for the preparation of callus and protoplasts from an elite inbred line of maize, transformation of protoplasts using PEG or electroporation, and the regeneration of maize plants from transformed protoplasts. Gordon-Kamm et al. (Plant Cell 2: 603-618 (1990)) and Fromm et al. (Biotechnology 8: 833-839 (1990)) have published techniques for transformation of an A188-derived maize line using particle bombardment. Furthermore, application WO 93/07278 (to Ciba-Geigy) and Koziel et al. (Biotechnology 11: 194-200 (1993)) describe techniques for the transformation of elite inbred lines of maize by particle bombardment. This technique utilizes immature maize embryos of 1.5-2.5 mm length excised from a maize ear 14-15 days after pollination and a PDS-1000He Biolistics device for bombardment.

Transformation of rice can also be undertaken by direct gene transfer techniques utilizing protoplasts or particle bombardment. Protoplast-mediated transformation has been described for Japonica-types and Indica-types (Zhang et al., Plant Cell Rep 7: 379-384 (1988); Shimamoto et al. Nature 338: 274-277 (1989); Datta et al. Biotechnology 8: 736-740 (1990)). Both types are also routinely transformable using particle bombardment (Christou et al. Biotechnology 9: 957-962 (1991)).

Patent Application EP 0 332 581 (to Ciba-Geigy) describes techniques for the generation, transformation and regeneration of Pooideae protoplasts. These techniques allow the transformation of Dactylis and wheat. Furthermore, wheat transformation has been described by Vasil et al. (Biotechnology 10: 667-674 (1992)) using particle bombardment into cells of type C long-term regenerable callus, and also by Vasil et al. (Biotechnology 11: 1553-1558 (1993)) and Weeks et al. (Plant Physiol. 102: 1077-1084 (1993)) using particle bombardment of immature embryos and immature embryo-derived callus. A preferred technique for wheat transformation, however, involves the transformation of wheat by particle bombardment of immature embryos and includes either a high sucrose or a high maltose step prior to gene delivery. Prior to bombardment, any number of embryos (0.75-1 mm in length) are plated onto MS medium with 3% sucrose (Murashige & Skoog, Physiologia Plantarum 15: 473-497 (1962)) and 3 mg/l 2,4-D for induction of somatic embryos, which is allowed to proceed in the dark. On the chosen day of bombardment, embryos are removed from the induction medium and placed onto the osmoticum (i.e. induction medium with sucrose or maltose added at the desired concentration, typically 15%). The embryos are allowed to plasmolyze for 2-3 h and are then bombarded. Twenty embryos per target plate is typical, although not critical. An appropriate gene-carrying plasmid (such as pCIB3064 or pSOG35) is precipitated onto micrometer size gold particles using standard procedures. Each plate of embryos is shot with the DuPont Biolistics® helium device using a burst pressure of ~1000 psi and a standard 80 mesh screen. After bombardment, the embryos are placed back into the dark to recover for about 24 h (still on osmoticum). After 24 h, the embryos are removed from the osmoticum and placed back onto induction medium where they stay for about a month before regeneration.
Approximately one month later the embryo explants with developing embryogenic callus are transferred to regeneration medium (MS+1 mg/liter NAA, 5 mg/liter GA), further containing the appropriate selection agent (10 mg/l basta in the case of pCIB3064 and 2 mg/l methotrexate in the case of pSOG35). After approximately one month, developed shoots are transferred to larger sterile containers known as "GA7s" which contain half-strength MS, 2% sucrose, and the same concentration of selection agent. Patent application Ser. No. 08/147,161 describes methods for wheat transformation and is hereby incorporated by reference.

Example 41

Expression of Pyrrolnitrin in Transgenic Plants

The GC content of all four pyrrolnitrin ORFs is between 62 and 68% and consequently no AT-content related problems are anticipated with their expression in plants. It may, however, be advantageous to modify the genes to include codons preferred in the appropriate target plant species. Fusions of the kind described below can be made to any desired promoter with or without modification (e.g. for optimized translational initiation in plants or for enhanced expression).
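
GC content of the kind quoted above is simply the percentage of G and C bases in a sequence. The following is a minimal sketch of that calculation; the example sequence is an invented placeholder, not one of the pyrrolnitrin ORFs, and the function names are illustrative assumptions.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def within_reported_range(seq: str, low: float = 62.0, high: float = 68.0) -> bool:
    """Check a sequence against the 62-68% range quoted in the text."""
    return low <= gc_content(seq) <= high


# Placeholder sequence for illustration only (4 of 6 bases are G or C).
example = "GGCCAT"
print(round(gc_content(example), 1), within_reported_range(example))
```

Applied to a full ORF, the same function gives the figure that would be compared against the range stated for the pyrrolnitrin genes.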

Expression behind the 35S Promoter

Each of the four pyrrolnitrin ORFs is transferred to pBluescript KS II for further manipulation. This is done by PCR amplification using primers homologous to each end of each gene and which additionally include a restriction site to facilitate the transfer of the amplified fragments to the pBluescript vector. For ORF1, the aminoterminal primer includes a SalI site and the carboxyterminal primer a NotI site. Similarly for ORF2, the aminoterminal primer includes a SalI site and the carboxyterminal primer a NotI site. For ORF3, the aminoterminal primer includes a NotI site and the carboxyterminal primer an XhoI site. Similarly for ORF4, the aminoterminal primer includes a NotI site and the carboxyterminal primer an XhoI site. Thus, the amplified fragments are cleaved with the appropriate restriction enzymes (chosen because they do not cleave within the ORF) and are then ligated into pBluescript, also correspondingly cleaved. The cloning of the individual ORFs in pBluescript facilitates their subsequent manipulation.

Destruction of internal restriction sites which are required for further construction is undertaken using the procedure of "inside-outside PCR" (Innes et al. PCR Protocols: A guide to methods and applications. Academic Press, New York (1990)). Unique restriction sites are sought at either side of the site to be destroyed (ideally between 100 and 500 bp from the site to be destroyed) and two separate amplifications are set up. One extends from the unique site left of the site to be destroyed and amplifies DNA up to the site to be destroyed with an amplifying oligonucleotide which spans this site and incorporates an appropriate base change. The second amplification extends from the site to be destroyed up to the unique site rightwards of the site to be destroyed. The oligonucleotide spanning the site to be destroyed in this second reaction incorporates the same base change as in the first amplification and ideally shares an overlap of between 10 and 25 nucleotides with the oligonucleotide from the first reaction. Thus the products of both reactions share an overlap which incorporates the same base change in the restriction site corresponding to that made in each amplification. Following the two amplifications, the amplified products are gel purified (to remove the four oligonucleotide primers used), mixed together and reamplified in a PCR reaction using the two primers spanning the unique restriction sites. In this final PCR reaction the overlap between the two amplified fragments provides the priming necessary for the first round of synthesis. The product of this reaction extends from the leftwards unique restriction site to the rightwards unique restriction site and includes the modified restriction site located internally. This product can be cleaved with the unique sites and inserted into the unmodified gene at the appropriate location by replacing the wild-type fragment.
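
The logic of the three reactions can be followed with plain strings. The sketch below is a simplified, single-strand illustration under stated assumptions: the template is a made-up sequence, the overlap length (`flank`) is a free parameter, and primer design, strand orientation and the flanking unique sites are not modeled. It shows only that the two first-round products share an overlap carrying the base change and that fusing them through this overlap yields the full-length fragment with the site destroyed.

```python
def inside_outside_pcr(template: str, site: str, replacement: str, flank: int = 8):
    """Simulate the two first-round products and the fused final product.

    template:    sequence between the two unique flanking sites (illustrative)
    site:        internal restriction site to destroy
    replacement: mutated version of the site (same length)
    flank:       bases on each side of the site covered by the spanning oligos
    """
    if len(site) != len(replacement):
        raise ValueError("replacement must have the same length as the site")
    if template.count(site) != 1:
        raise ValueError("expected exactly one copy of the site in the template")
    start = template.index(site)
    end = start + len(site)
    if start < flank or end + flank > len(template):
        raise ValueError("site is too close to an end of the template")

    # Reaction 1: from the left end up to and across the mutated site.
    left_product = template[:start] + replacement + template[end:end + flank]
    # Reaction 2: from just before the mutated site to the right end.
    right_product = template[start - flank:start] + replacement + template[end:]
    # Shared region that primes the first round of the final reaction.
    overlap = template[start - flank:start] + replacement + template[end:end + flank]
    assert left_product.endswith(overlap) and right_product.startswith(overlap)

    # Reaction 3: the fragments are joined through the overlap.
    fused = left_product + right_product[len(overlap):]
    return left_product, right_product, fused


# Made-up template with a single SphI site (GCATGC) in the middle.
template = "AAACCCGGGTTT" + "GCATGC" + "TTTGGGCCCAAA"
left, right, fused = inside_outside_pcr(template, "GCATGC", "GCATGA")
print(fused)
print("site destroyed:", "GCATGC" not in fused)
```

With the default `flank` of 8 and a 6 bp site the shared overlap is 22 nucleotides, within the 10 to 25 nucleotide overlap recommended in the text.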

To render ORF1 free of the first of its two internal SphI sites, oligonucleotides spanning and homologous to the unique XmaI and EspI sites are designed. The XmaI oligonucleotide is used in a PCR reaction together with an oligonucleotide which spans the first SphI site and includes the sequence . . . CCCCCTCATGC . . . (lower strand, SEQ ID NO:15), thus introducing a base change into the SphI site. A second PCR reaction utilizes an oligonucleotide spanning the SphI site (upper strand) and incorporating the sequence . . . GCATGACAGGGGG . . . (SEQ ID NO:16), used in combination with the EspI site-spanning oligonucleotide. The two products are gel purified and themselves amplified with the XmaI- and EspI-spanning oligonucleotides, and the resultant fragment is cleaved with XmaI and EspI and used to replace the native fragment in the ORF1 clone. According to the above description, the modified SphI site is GCATGA and does not cause a codon change. Other changes in this site are possible (i.e. changing the second nucleotide to a G, T, or A) without corrupting amino acid integrity.
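Which single-base changes destroy a site without altering the encoded amino acids depends on the reading frame in which the site lies. The following sketch enumerates such changes using the standard genetic code; the function names and the toy coding sequence are hypothetical, and the reading frame of the toy example is an assumption made for illustration rather than the frame of ORF1.

```python
# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(seq):
    """Translate seq from its first base, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

def silent_site_variants(coding_seq, site):
    """List single-base variants of the first occurrence of site that remove
    the site while leaving the translated protein unchanged."""
    coding_seq = coding_seq.upper()
    start = coding_seq.find(site)
    if start == -1:
        raise ValueError("site not found")
    protein = translate(coding_seq)
    variants = []
    for offset in range(len(site)):
        pos = start + offset
        for base in "ACGT":
            if base == coding_seq[pos]:
                continue
            mutated = coding_seq[:pos] + base + coding_seq[pos + 1:]
            if translate(mutated) == protein and site not in mutated:
                variants.append(mutated[start:start + len(site)])
    return variants

# Hypothetical in-frame toy sequence: ATG AGC ATG CGA TAA (M S M R stop).
toy_cds = "ATGAGCATGCGATAA"
print(silent_site_variants(toy_cds, "GCATGC"))
```

For this assumed frame the candidates include GCATGA, the modified site named in the text, but in a different frame the same change could alter or truncate the protein, which is why each proposed change is checked against the actual coding sequence.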

A similar strategy is used to destroy the second SphI site in ORF1. In this case, EspI is a suitable leftward-located restriction site, and the rightward-located restriction site is PstI, located close to the 3' end of the gene, or alternatively SstI, which is not found in the ORF sequence but is immediately adjacent in the pBluescript polylinker. In this case an appropriate oligonucleotide is one which spans this site, or alternatively one of the available pBluescript sequencing primers. This SphI site is modified to GAATGC or GCATGT or GAATGT. Each of these changes destroys the site without causing a codon change.

To render ORF2 free of its single SphI site a similar procedure is used. Leftward restriction sites are provided by PstI or MluI, and a suitable rightward restriction site is provided by SstI in the pBluescript polylinker. In this case the site is changed to GCTTGC, GCATGC or GCTTGT; these changes maintain amino acid integrity.

ORF3 has no internal SphI sites.

In the case of ORF4, PstI provides a suitable rightward unique site, but there is no suitable site located leftward of the single SphI site to be changed. In this case a restriction site in the pBluescript polylinker can be used to the same effect as already described above. The SphI site is modified to GGATGC, GTATGC, GAATGC, or GCATGT etc.

The removal of SphI sites from the pyrrolnitrin biosynthetic genes as described above facilitates their transfer to the pCGN1761SENX vector by amplification using an aminoterminal oligonucleotide primer which incorporates an SphI site at the ATG and a carboxyterminal primer which incorporates a restriction site not found in the gene being amplified. The resultant amplified fragment is cleaved with SphI and the carboxyterminal enzyme and cloned into pCGN1761SENX. Suitable restriction enzyme sites for incorporation into the carboxyterminal primer are NotI (for all four ORFs), XhoI (for ORF3 and ORF4), and EcoRI (for ORF4). Given the requirement for the nucleotide C at position 6 within the SphI recognition site, in some cases the second codon of the ORF may require changing so as to start with the nucleotide C. This construction fuses each ORF at its ATG to the SphI site of the translation-optimized vector pCGN1761SENX in operable linkage to the double 35S promoter. After construction is complete the final gene insertions and fusion points are resequenced to ensure that no undesired base changes have occurred.

By utilizing an aminoterminal oligonucleotide primer which incorporates an NcoI site at its ATG instead of an SphI site, ORFs 1-4 can also be easily cloned into the translation-optimized vector pCGN1761NENX. None of the four pyrrolnitrin biosynthetic gene ORFs carries an NcoI site and consequently there is no requirement in this case to destroy internal restriction sites. Primers for the carboxyterminus of the gene are designed as described above and the cloning is undertaken in a similar fashion. Given the requirement for the nucleotide G at position 6 within the NcoI recognition site, in some cases the second codon of the ORF may require changing so as to start with the nucleotide G. This construction fuses each ORF at its ATG to the NcoI site of pCGN1761NENX in operable linkage to the double 35S promoter.

The expression cassettes of the appropriate pCGN1761-derivative vectors are transferred to transformation vectors. Where possible, multiple expression cassettes are transferred to a single transformation vector so as to reduce the number of plant transformations and crosses between transformants which may be required to produce plants expressing all four ORFs and thus producing pyrrolnitrin.

Expression behind 35S with Chloroplast Targeting

The pyrrolnitrin ORFs 1-4 amplified using oligonucleotides carrying an SphI site at their aminoterminus are cloned into the 35S-chloroplast targeted vector pCGN1761/CT. The fusions are made to the SphI site located at the cleavage site of the rbcS transit peptide. The expression cassettes thus created are transferred to appropriate transformation vectors (see above) and used to generate transgenic plants. As tryptophan, the precursor for pyrrolnitrin biosynthesis, is synthesized in the chloroplast, it may be advantageous to express the biosynthetic genes for pyrrolnitrin in the chloroplast to ensure a ready supply of substrate. Transgenic plants expressing all four ORFs will target all four gene products to the chloroplast and will thus synthesize pyrrolnitrin in the chloroplast.

Expression behind rbcS with Chloroplast Targeting

The pyrrolnitrin ORFs 1-4 amplified using oligonucleotides carrying an SphI site at their aminoterminus are cloned into the rbcS-chloroplast targeted vector pCGN1761rbcS/CT. The fusions are made to the SphI site located at the cleavage site of the rbcS transit peptide. The expression cassettes thus created are transferred to appropriate transformation vectors (see above) and used to generate transgenic plants. As tryptophan, the precursor for pyrrolnitrin biosynthesis, is synthesized in the chloroplast, it may be advantageous to express the biosynthetic genes for pyrrolnitrin in the chloroplast to ensure a ready supply of substrate. Transgenic plants expressing all four ORFs will target all four gene products to the chloroplast and will thus synthesize pyrrolnitrin in the chloroplast. The expression of the four ORFs will, however, be light induced.

Example 42

Expression of Soraphen in Transgenic Plants

Clone p98/1 contains the entirety of the soraphen biosynthetic gene ORF1, which encodes five biosynthetic modules for soraphen biosynthesis. The partially sequenced ORF2 contains the remaining three modules. Also required for soraphen biosynthesis is the soraphen methylase, located on the same operon.

Soraphen ORF1 is manipulated for expression in transgenic plants in the following manner. A DNA fragment is amplified from the aminoterminus of ORF1 using PCR and p98/1 as template. The 5' oligonucleotide primer includes either an SphI site or an NcoI site at the ATG for cloning into the vectors pCGN1761SENX or pCGN1761NENX respectively. Further, the 5' oligonucleotide includes either the base C (for SphI cloning) or the base G (for NcoI cloning) immediately after the ATG, and thus the second amino acid of the protein is changed either to a histidine or an aspartate (other amino acids can be selected for position 2 by additionally changing other bases of the second codon). The 3' oligonucleotide for the amplification is located at the first BglII site of the ORF and incorporates a distal EcoRI site, enabling the amplified fragment to be cleaved with SphI (or NcoI) and EcoRI and then cloned into pCGN1761SENX (or pCGN1761NENX). To facilitate cleavage of the amplified fragments, each oligonucleotide includes several additional bases at its 5' end. The oligonucleotides preferably have 12-30 bp homology to the ORF1 template, in addition to the required restriction sites and additional sequences. This manipulation fuses the aminoterminal approximately 112 amino acids of ORF1 at its ATG to the SphI or NcoI sites of the translation-optimized vectors pCGN1761SENX or pCGN1761NENX in linkage to the double 35S promoter. The remainder of ORF1 is carried on three BglII fragments which can be sequentially cloned into the unique BglII site of the above-detailed constructions. The introduction of the first of these fragments is straightforward, and requires only the cleavage of the aminoterminal construction with BglII followed by introduction of the fragment. For the introduction of the two remaining fragments, partial digestion of the aminoterminal construction is required (since this construction now has an additional BglII site), followed by introduction of the next BglII fragment. Thus, it is possible to construct a vector containing the entire approximately 25 kb of soraphen ORF1 in operable fusion to the 35S promoter.

An alternative approach to constructing the soraphen ORF1 by the fusion of sequential restriction fragments is to amplify the entire ORF using PCR. Barnes (Proc. Natl. Acad. Sci. USA 91: 2216-2220 (1994)) has recently described techniques for the high-fidelity amplification of fragments by PCR of up to 35 kb, and these techniques can be applied to ORF1. Oligonucleotides specific for each end of ORF1, with appropriate restriction sites added, are used to amplify the entire coding region, which is then cloned into appropriate sites in a suitable vector such as pCGN1761 or its derivatives. Typically after PCR amplification, resequencing is advised to ensure that no base changes have arisen in the amplified sequence. Alternatively, a functional assay can be done directly in transgenic plants.

Yet another approach to the expression of the genes for polyketide biosynthesis (such as soraphen) in transgenic plants is to construct, for expression in plants, transcriptional units which comprise less than the usual complement of modules, and to provide the remaining modules on other transcriptional units. It is believed that the biosynthesis of polyketide antibiotics such as soraphen requires the sequential activity of specific modules, and that for the synthesis of a specific molecule these activities should be provided in a specific sequence. Because the sequential enzymatic nature of the wild-type genes is determined by their configuration on a single molecule, it is likely that the expression in a plant of different transgenes carrying different modules may lead to the biosynthesis of novel polyketide molecules. It is assumed that the localization of five specific modules for soraphen biosynthesis on ORF1 is determinative in the biosynthesis of soraphen, and that the expression of, say, three modules on one transgene and the other two on another, together with ORF2, may result in biosynthesis of a polyketide with a different molecular structure and possibly with a different antipathogenic activity. This invention encompasses all such deviations of module expression which may result in the synthesis in transgenic organisms of novel polyketides.

Although specific construction details are only provided for ORF1 above, similar techniques are used to express ORF2 and the soraphen methylase in transgenic plants. For the expression of functional soraphen in plants it is anticipated that all three genes must be expressed, and this is done as detailed in this specification.

Fusions of the kind described above can be made to any desired promoter with or without modification (e.g. for optimized translational initiation in plants or for enhanced expression). As the ORFs identified for soraphen biosynthesis have a GC content of around 70%, it is not anticipated that the coding sequences should require modification to increase GC content for optimal expression in plants. It may, however, be advantageous to modify the genes to include codons preferred in the appropriate target plant species.

Example 43

Expression of Phenazine in Transgenic Plants

The GC content of all the cloned genes encoding biosynthetic enzymes for phenazine synthesis is between 58 and 65% and consequently no AT-content related problems are anticipated with their expression in plants (although it may be advantageous to modify the genes to include codons preferred in the appropriate target plant species). Fusions of the kind described below can be made to any desired promoter with or without modification (e.g. for optimized translational initiation in plants or for enhanced expression).

Expression behind the 35S Promoter

Each of the three phenazine ORFs is transferred to pBluescript II SK for further manipulation. The phzB ORF is transferred as an EcoRI-BglII fragment cloned from plasmid pLSP18-6H3del3 containing the entire phenazine operon. This fragment is transferred to the EcoRI-BamHI sites of pBluescript II SK. The phzC ORF is transferred from pLSP18-6H3del3 as an XhoI-ScaI fragment cloned into the XhoI-SmaI sites of pBluescript II SK. The phzD ORF is transferred from pLSP18-6H3del3 as a BglII-HindIII fragment into the BamHI-HindIII sites of pBluescript II SK.

Where further construction requires the destruction of internal restriction sites, this is undertaken using the procedure of "inside-outside PCR" described above (Innes et al. supra). In the case of the phzB ORF two SphI sites are destroyed (one site located upstream of the ORF is left intact). The first of these is destroyed using the unique restriction sites EcoRI (left of the SphI site to be destroyed) and BclI (right of the SphI site). For this manipulation to be successful, the DNA to be BclI cleaved for the final assembly of the inside-outside PCR product must be produced in a dam-minus E. coli host such as SCS110 (Stratagene). For the second phzB SphI site, the selected unique restriction sites are PstI and SpeI, the latter being beyond the phzB ORF in the pBluescript polylinker. The phzC ORF has no internal SphI sites, and so this procedure is not required for phzC. The phzD ORF, however, has a single SphI site which can be removed using the unique restriction sites XmaI and HindIII (the XmaI/SmaI site of the pBluescript polylinker is no longer present due to the insertion of the ORF between the BamHI and HindIII sites).

The removal of SphI sites from the phenazine biosynthetic genes as described above facilitates their transfer to the pCGN1761SENX vector by amplification using an aminoterminal oligonucleotide primer which incorporates an SphI site at the ATG and a carboxyterminal primer which incorporates a restriction site not found in the gene being amplified. The resultant amplified fragment is cleaved with SphI and the carboxyterminal enzyme and cloned into pCGN1761SENX. Suitable restriction enzyme sites for incorporation into the carboxyterminal primer are EcoRI and NotI (for all three ORFs; NotI will need checking when the sequence is complete), and XhoI (for phzB and phzD). Given the requirement for the nucleotide C at position 6 within the SphI recognition site, in some cases the second codon of the ORF may require changing so as to start with the nucleotide C. This construction fuses each ORF at its ATG to the SphI site of the translation-optimized vector pCGN1761SENX in operable linkage to the double 35S promoter. After construction is complete the final gene insertions and fusion points are resequenced to ensure that no undesired base changes have occurred.

By utilizing an aminoterminal oligonucleotide primer which incorporates an NcoI site at its ATG instead of an SphI site, the three phz ORFs can also be easily cloned into the translation-optimized vector pCGN1761NENX. None of the three phenazine biosynthetic gene ORFs carries an NcoI site and consequently there is no requirement in this case to destroy internal restriction sites. Primers for the carboxyterminus of the gene are designed as described above and the cloning is undertaken in a similar fashion. Given the requirement for the nucleotide G at position 6 within the NcoI recognition site, in some cases the second codon of the ORF may require changing so as to start with the nucleotide G. This construction fuses each ORF at its ATG to the NcoI site of pCGN1761NENX in operable linkage to the double 35S promoter.

The expression cassettes of the appropriate pCGN1761-derivative vectors are transferred to transformation vectors. Where possible, multiple expression cassettes are transferred to a single transformation vector so as to reduce the number of plant transformations and crosses between transformants which may be required to produce plants expressing all three ORFs and thus producing phenazine.

Expression behind 35S with Chloroplast Targeting

The three phenazine ORFs amplified using oligonucleotides carrying an SphI site at their aminoterminus are cloned into the 35S-chloroplast targeted vector pCGN1761/CT. The fusions are made to the SphI site located at the cleavage site of the rbcS transit peptide. The expression cassettes thus created are transferred to appropriate transformation vectors (see above) and used to generate transgenic plants. As chorismate, the likely precursor for phenazine biosynthesis, is synthesized in the chloroplast, it may be advantageous to express the biosynthetic genes for phenazine in the chloroplast to ensure a ready supply of substrate. Transgenic plants expressing all three ORFs will target all three gene products to the chloroplast and will thus synthesize phenazine in the chloroplast.

Expression behind rbcS with Chloroplast Targeting

The three phenazine ORFs amplified using oligonucleotides carrying an SphI site at their aminoterminus are cloned into the rbcS-chloroplast targeted vector pCGN1761rbcS/CT. The fusions are made to the SphI site located at the cleavage site of the rbcS transit peptide. The expression cassettes thus created are transferred to appropriate transformation vectors (see above) and used to generate transgenic plants. As chorismate, the likely precursor for phenazine biosynthesis, is synthesized in the chloroplast, it may be advantageous to express the biosynthetic genes for phenazine in the chloroplast to ensure a ready supply of substrate. Transgenic plants expressing all three ORFs will target all three gene products to the chloroplast and will thus synthesize phenazine in the chloroplast. The expression of the three ORFs will, however, be light induced.

Example 44

Expression of the Non-Ribosomally Synthesized Peptide Antibiotic Gramicidin in Transgenic Plants

The three Bacillus brevis gramicidin biosynthetic genes grsA, grsB and grsT have been previously cloned and sequenced (Turgay et al. Mol. Microbiol. 6: 529-546 (1992); Kraetzschmar et al. J. Bacteriol. 171: 5422-5429 (1989)). They are 3296, 13358, and 770 bp in length, respectively. These sequences are also published as GenBank accession numbers X61658 and M29703. The manipulations described here can be undertaken using the publicly available clones published by Turgay et al. (supra) and Kraetzschmar et al. (supra), or alternatively using clones newly isolated from Bacillus brevis as described herein.

Each of the three ORFs grsA, grsB, and grsT is PCR amplified using oligonucleotides which span the entire coding sequence. The leftward (upstream) oligonucleotide includes an SstI site and the rightward (downstream) oligonucleotide includes an XhoI site. These restriction sites are not found within any of the three coding sequences and enable the amplified products to be cleaved with SstI and XhoI for insertion into the corresponding sites of pBluescript II SK. This generates the clones pBLGRSa, pBLGRSb and pBLGRSt. The GC content of these genes lies between 35 and 38%. Ideally, the coding sequences of the three genes may be remade using the techniques referred to in Section K; however, it is possible that the unmodified genes may be expressed at high levels in transgenic plants without encountering problems due to their AT content. In any case it may be advantageous to modify the genes to include codons preferred in the appropriate target plant species.
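GC content figures of the kind quoted above can be computed directly from a coding sequence, either overall or in windows to locate AT-rich stretches. The following is a minimal sketch; the function names and toy sequences are hypothetical and are not the grs gene sequences.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)

def windowed_gc(seq, window, step):
    """Return the GC percentage of each full-length window along seq."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]

# Hypothetical toy sequences for illustration only.
print(gc_content("ATGCATAT"))
print(windowed_gc("GGCCAATT", 4, 4))
```

A windowed profile is only a rough guide; whether a given AT-rich region actually limits expression in plants has to be determined experimentally.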

The ORF grsA contains no SphI site and no NcoI site. This gene can thus be amplified from pBLGRSa using an aminoterminal oligonucleotide which incorporates either an SphI site or an NcoI site at the ATG, and a second carboxyterminal oligonucleotide which incorporates an XhoI site, thus enabling the amplification product to be cloned directly into pCGN1761SENX or pCGN1761NENX behind the double 35S promoter.

The ORF grsB contains no NcoI site and therefore this gene can be amplified using an aminoterminal oligonucleotide containing an NcoI site in the same way as described above for the grsA ORF; the amplified fragment is cleaved with NcoI and XhoI and ligated into pCGN1761NENX. However, the grsB ORF contains three SphI sites and these are destroyed to facilitate the subsequent cloning steps. The sites are destroyed using the "inside-outside" PCR technique described above. Unique cloning sites found within the grsB gene but not within pBluescript II SK are EcoNI, PflMI, and RsrII. Either EcoNI or PflMI can be used together with RsrII to remove the first two sites, and RsrII can be used together with the ApaI site of the pBluescript polylinker to remove the third site. Once these sites have been destroyed (without causing a change in amino acid), the entirety of the grsB ORF can be amplified using an aminoterminal oligonucleotide including an SphI site at the ATG and a carboxyterminal oligonucleotide incorporating an XhoI site. The resultant fragment is cloned into pCGN1761SENX. In order to successfully PCR-amplify fragments of such size, amplification protocols are modified in view of Barnes (1994, supra), who describes the high-fidelity amplification of large DNA fragments. An alternative approach to the transfer of the grsB ORF to pCGN1761SENX, which does not necessitate the destruction of the three SphI restriction sites, involves the transfer to the SphI and XhoI cloning sites of pCGN1761SENX of an aminoterminal fragment of grsB. This fragment is amplified from the ATG of the gene using an aminoterminal oligonucleotide which incorporates an SphI site at the ATG, and a second oligonucleotide which is adjacent and 3' to the PflMI site in the ORF and which includes an XhoI site. Thus the aminoterminal amplified fragment is cleaved with SphI and XhoI and cloned into pCGN1761SENX. Subsequently the remaining portion of the grsB gene is excised from pBLGRSb using PflMI and XhoI (which cuts in the pBluescript polylinker) and cloned into the aminoterminal-carrying construction cleaved with PflMI and XhoI to reconstitute the gene.

The ORF grsT contains no SphI site and no NcoI site. This gene can thus be amplified from pBLGRSt using an aminoterminal oligonucleotide which incorporates either an SphI site or an NcoI site at the initiating codon, which is changed to ATG (from GTG) for expression in plants, and a second carboxyterminal oligonucleotide which incorporates an XhoI site, thus enabling the amplification product to be cloned directly into pCGN1761SENX or pCGN1761NENX behind the double 35S promoter.

Given the requirement for the nucleotide C at position 6 within the SphI recognition site, and the requirement for the nucleotide G at position 6 within the NcoI recognition site, in some cases the second codon of the ORF may require changing so as to start with the appropriate nucleotide.
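The effect of this requirement on the second codon can be worked through explicitly. SphI recognizes GCATGC and NcoI recognizes CCATGG, so when the ATG of the ORF is placed at positions 3-5 of the site, the first base of the second codon must be C or G respectively. The sketch below is illustrative only: the function name and the toy ORF are hypothetical, and it changes only the first base of the second codon.

```python
# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# For each enzyme: the two bases preceding the ATG within the site,
# and the base required at position 6 (first base of the second codon).
SITE_PARTS = {"SphI": ("GC", "C"), "NcoI": ("CC", "G")}

def fuse_at_atg(orf, enzyme):
    """Return (site-plus-second-codon sequence, original residue 2, new residue 2)."""
    prefix, required = SITE_PARTS[enzyme]
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF must begin with ATG")
    old_codon = orf[3:6]
    new_codon = required + old_codon[1:]
    primer_core = prefix + "ATG" + new_codon
    return primer_core, CODON_TABLE[old_codon], CODON_TABLE[new_codon]

# Hypothetical toy ORF whose second codon is AAC (asparagine).
toy_orf = "ATGAACAAGTAA"
print(fuse_at_atg(toy_orf, "SphI"))
print(fuse_at_atg(toy_orf, "NcoI"))
```

When the second codon already begins with the required base, the residue is unchanged; otherwise further bases of the codon can be altered to select a different amino acid at position 2.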

Transgenic plants are created which express all three gramicidin biosynthetic genes as described elsewhere in the specification. Transgenic plants expressing all three genes synthesize gramicidin.

Example 45

Expression of the Ribosomally Synthesized Peptide Lantibiotic Epidermin in Transgenic Plants

The epiA ORF encodes the structural unit for epidermin biosynthesis and is approximately 420 bp in length (GenBank Accession No. X07840; Schnell et al., Nature 333: 276-278 (1988)). This gene can be subcloned using PCR techniques from the plasmid pTu32 into pBluescript SK II using oligonucleotides carrying the terminal restriction sites BamHI (5') and PstI (3'). The epiA gene sequence has a GC content of 27% and this can be increased using techniques of gene synthesis referred to elsewhere in this specification; this sequence modification may not be essential, however, to ensure high-level expression in plants. Subsequently the epiA ORF is transferred to the cloning vector pCGN1761SENX or pCGN1761NENX by PCR amplification of the gene using an aminoterminal oligonucleotide spanning the initiating methionine and carrying an SphI site (for cloning into pCGN1761SENX) or an NcoI site (for cloning into pCGN1761NENX), together with a carboxyterminal oligonucleotide carrying an EcoRI, a NotI, or an XhoI site for cloning into either pCGN1761SENX or pCGN1761NENX. Given the requirement for the nucleotide C at position 6 within the SphI recognition site, and the requirement for the nucleotide G at position 6 within the NcoI recognition site, in some cases the second codon of the ORF may require changing so as to start with the appropriate nucleotide.

Using cloning techniques described in this specification or well known in the art, the remaining genes of the epi operon (viz. epiB, epiC, epiD, epiQ, and epiP) are subcloned from plasmid pTu32 into pBluescript SK II. These genes are responsible for the modification and polymerization of the epiA-encoded structural unit and are described in Kupke et al. (J. Bacteriol. 174: 5354-5361 (1992)) and Schnell et al. (Eur. J. Biochem. 204: 57-68 (1992)). The subcloned ORFs are manipulated for transfer to pCGN1761-derivative vectors as described above. The expression cassettes of the appropriate pCGN1761-derivative vectors are transferred to transformation vectors. Where possible, multiple expression cassettes are transferred to a single transformation vector so as to reduce the number of plant transformations and crosses between transformants which may be required to produce plants expressing all required ORFs and thus producing epidermin.

L. Analysis of Transgenic Plants for APS Accumulation

Example 46

Analysis of APS Gene Expression

Expression of APS genes in transgenic plants can be analyzed using standard Northern blot techniques to assess the amount of APS mRNA accumulating in tissues. Alternatively, the quantity of APS gene product can be assessed by Western analysis using antisera raised to APS biosynthetic gene products. Antisera can be raised using conventional techniques and proteins derived from the expression of APS genes in a host such as E. coli. To avoid the raising of antisera to multiple gene products from E. coli expressing multiple APS genes from multiple-ORF operons, the APS biosynthetic genes can be expressed individually in E. coli. Alternatively, antisera can be raised to synthetic peptides designed to be homologous or identical to known or predicted amino acid sequences of APS biosynthetic gene products. These techniques are well known in the art.

Example 47

Analysis of APS Production in Transgenic Plants

For each APS, known protocols are used to detect production of the APS in transgenic plant tissue. These protocols are available in the appropriate APS literature. For pyrrolnitrin, the procedure described in Example 11 is used, and for soraphen the procedure described in Example 17. For phenazine determination, the procedure described in Example 18 can be used. For non-ribosomal peptide antibiotics such as gramicidin S, an appropriate general technique is the assaying of ATP-PPi exchange. In the case of gramicidin, the grsA gene product can be assayed by phenylalanine-dependent ATP-PPi exchange and the grsB gene product can be assayed by proline-, valine-, ornithine-, or leucine-dependent ATP-PPi exchange. Alternative techniques are described by Gause & Brazhnikova (Lancet 247: 715 (1944)). For ribosomally synthesized peptide antibiotics, isolation can be achieved by butanol extraction, dissolving in methanol and diethyl ether, followed by chromatography as described by Allgaier et al. for epidermin (Eur. J. Biochem. 160: 9-22 (1986)). For many APSs (e.g. pyrrolnitrin, gramicidin, phenazine) appropriate techniques are provided in the Merck Index (Merck & Co., Rahway, N.J. (1989)).

M. Assay of Disease Resistance in Transgenic Plants

Transgenic plants expressing APS biosynthetic genes are assayed for resistance to phytopathogens using techniques well known in phytopathology. For foliar pathogens, plants are grown in the greenhouse and, at an appropriate stage of development, inoculum of a phytopathogen of interest is introduced in an appropriate manner. For soil-borne phytopathogens, the pathogen is normally introduced into the soil before or at the time the seeds are planted. The choice of plant cultivar selected for introduction of the genes will have taken into account relative phytopathogen sensitivity. Thus, it is preferred that the cultivar chosen will be susceptible to most phytopathogens of interest to allow a determination of enhanced resistance.

Assay of Resistance to Foliar Phytopathogens

Example 48

Disease Resistance to Tobacco Foliar Phytopathogens

Transgenic tobacco plants expressing APS genes and shown to produce APS compound are subjected to the following disease tests.

Phytophthora parasitica/Black shank. Assays for resistance to Phytophthora parasitica, the causative organism of black shank, are performed on six-week-old plants grown as described in Alexander et al., Proc. Natl. Acad. Sci. USA 90: 7327-7331. Plants are watered, allowed to drain well, and then inoculated by applying 10 mL of a sporangium suspension (300 sporangia/mL) to the soil. Inoculated plants are kept in a greenhouse maintained at 23°-25° C. day temperature and 20°-22° C. night temperature. The wilt index used for the assay is as follows: 0=no symptoms; 1=some sign of wilting, with reduced turgidity; 2=clear wilting symptoms, but no rotting or stunting; 3=clear wilting symptoms with stunting, but no apparent stem rot; 4=severe wilting, with visible stem rot and some damage to root system; 5=as for 4, but plants near death or dead, and with severe reduction of root system. All assays are scored blind on plants arrayed in a random design.

Pseudomonas syringae. Pseudomonas syringae pv. tabaci (strain #551) is injected into the two lower leaves of several 6-7 week old plants at a concentration of 10⁶ or 3×10⁶ per ml in H₂O. Six individual plants are evaluated at each time point. Pseudomonas tabaci-infected plants are rated on a 5 point disease severity scale, 5=100% dead tissue, 0=no symptoms. A T-test (LSD) is conducted on the evaluations for each day and the groupings are indicated after the mean disease rating value. Values followed by the same letter on that day of evaluation are not statistically significantly different.

Cercospora nicotianae. A spore suspension of Cercospora nicotianae (ATCC #18366) (100,000-150,000 spores per ml) is sprayed to imminent run-off onto the surface of the leaves. The plants are maintained in 100% humidity for five days. Thereafter the plants are misted with H₂O 5-10 times per day. Six individual plants are evaluated at each time point. Cercospora nicotianae infection is rated as the percentage of leaf area showing disease symptoms. A T-test (LSD) is conducted on the evaluations for each day and the groupings are indicated after the mean disease rating value. Values followed by the same letter on that day of evaluation are not statistically significantly different.

Statistical Analyses. All tests include non-transgenic tobacco (six plants per assay, of the same cultivar as the transgenic lines) (Alexander et al., Proc. Natl. Acad. Sci. USA 90: 7327-7331). Pairwise T-tests are performed to compare different genotype and treatment groups for each rating date.
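A pairwise comparison of disease ratings between a transgenic line and the non-transgenic control can be sketched as a pooled-variance two-sample t statistic. This is an illustration under stated assumptions, not the analysis actually performed: the function name and the ratings are hypothetical, and the statistic still has to be compared against a t distribution with the returned degrees of freedom to obtain a significance level.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for two independent groups,
    assuming equal variances in both groups."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / std_err, df

# Hypothetical wilt-index ratings (0-5 scale) for six plants per group.
transgenic = [1, 2, 1, 0, 1, 1]
control = [4, 3, 5, 4, 4, 4]
t_stat, df = pooled_t(transgenic, control)
print(round(t_stat, 3), df)
```

Ordinal ratings such as the wilt index only approximately satisfy the assumptions of a t-test, so a rank-based test may be a reasonable alternative for small groups.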

Assay of Resistance to Soil-Borne Phytopathogens

Example 49

Resistance to Rhizoctonia solani

Plant assays to determine resistance to Rhizoctonia solani are conducted by planting or transplanting seeds or seedlings into naturally or artificially infested soil. To create artificially infested soil, millet, rice, oat, or other similar seeds are first moistened with water, then autoclaved and inoculated with plugs of the fungal phytopathogen taken from an agar plate. When the seeds are fully overgrown with the phytopathogen, they are air-dried and ground into a powder. The powder is mixed into soil at a rate experimentally determined to cause disease. Disease may be assessed by comparing stand counts, root lesion ratings, and shoot and root weights of transgenic and non-transgenic plants grown in the infested soil. The disease ratings may also be compared to the ratings of plants grown under the same conditions but without phytopathogen added to the soil.

Example 50

Resistance to Pseudomonas solanacearum

Plant assays to determine resistance to Pseudomonas solanacearum are conducted by planting or transplanting seeds or seedlings into naturally or artificially infested soil. To create artificially infested soil, bacteria are grown in shake flask cultures, then mixed into the soil at a rate experimentally determined to cause disease. The roots of the plants may need to be slightly wounded to ensure disease development. Disease may be assessed by comparing stand counts, degree of wilting, and shoot and root weights of transgenic and non-transgenic plants grown in the infested soil. The disease ratings may also be compared to the ratings of plants grown under the same conditions but without phytopathogen added to the soil.

Example 51

Resistance to Soil-Borne Fungi which are Vectors for Virus Transmission

Many soil-borne Polymyxa, Olpidium and Spongospora species are vectors for the transmission of viruses. These include (1) Polymyxa betae, which transmits Beet Necrotic Yellow Vein Virus (the causative agent of rhizomania disease) to sugar beet, (2) Polymyxa graminis, which transmits Wheat Soil-Borne Mosaic Virus to wheat, and Barley Yellow Mosaic Virus and Barley Mild Mosaic Virus to barley, (3) Olpidium brassicae, which transmits Tobacco Necrosis Virus to tobacco, and (4) Spongospora subterranea, which transmits Potato Mop Top Virus to potato. Seeds or plants expressing APSs in their roots (e.g. constitutively or under root-specific expression) are sown or transplanted in sterile soil and fungal inocula carrying the virus of interest are introduced to the soil. After a suitable time period the transgenic plants are assayed for viral symptoms and accumulation of virus by ELISA and Northern blot. Control experiments involve no inoculation, and inoculation with fungus which does not carry the virus under investigation. The transgenic plant lines under analysis should ideally be susceptible to the virus in order to test the efficacy of the APS-based protection. In the case of viruses such as Barley Mild Mosaic Virus which are both Polymyxa-transmitted and mechanically transmissible, a further control is provided by the successful mechanical introduction of the virus into plants which are protected against soil-infection by APS expression in roots.

Resistance to virus-transmitting fungi offered by expression of APSs will thus prevent virus infections of target crops, thereby improving plant health and yield.

Example 52

Resistance to Nematodes

Transgenic plants expressing APSs are analyzed for resistance to nematodes. Seeds or plants expressing APSs in their roots (e.g. constitutively or under root-specific expression) are sown or transplanted in sterile soil and nematode inocula are introduced to the soil. Nematode damage is assessed at an appropriate time point. Root knot nematodes such as Meloidogyne spp. are introduced to transgenic tobacco or tomato expressing APSs. Cyst nematodes such as Heterodera spp. are introduced to transgenic cereals, potato and sugar beet. Lesion nematodes such as Pratylenchus spp. are introduced to transgenic soybean, alfalfa or corn. Reniform nematodes such as Rotylenchulus spp. are introduced to transgenic soybean, cotton, or tomato. Ditylenchus spp. are introduced to transgenic alfalfa. Detailed techniques for screening for resistance to nematodes are provided in Starr (Ed.; Methods for Evaluating Plant Species for Resistance to Plant Parasitic Nematodes, Society of Nematologists, Hyattsville, Md. (1990)).

Examples of Important Phytopathogens in Agricultural Crop Species

Example 53

Disease Resistance in Maize

Transgenic maize plants expressing APS genes and shown to produce APS compound are subjected to the following disease tests. Tests for each phytopathogen are conducted according to standard phytopathological procedures.

Leaf Diseases and Stalk Rots

(1) Northern Corn Leaf Blight (Helminthosporium turcicum† syn. Exserohilum turcicum).

(2) Anthracnose (Colletotrichum graminicola†-same as for Stalk Rot)

(3) Southern Corn Leaf Blight (Helminthosporium maydis† syn. Bipolaris maydis).

(3) Eye Spot (Kabatiella zeae)

(4) Common Rust (Puccinia sorghi).

(4) Southern Rust (Puccinia polysora).

(5) Gray Leaf Spot (Cercospora zeae-maydis† and C. sorghi)

(6) Stalk Rots (a complex of two or more of the following pathogens-Pythium aphanidermatum†-early, Erwinia chrysanthemi-zeae-early, Colletotrichum graminicola†, Diplodia maydis†, D. macrospora, Gibberella zeae†, Fusarium moniliforme†, Macrophomina phaseolina, Cephalosporium acremonium)

(7) Goss' Disease (Clavibacter nebraskanense)

Important-Ear Molds

(1) Gibberella Ear Rot (Gibberella zeae†-same as for Stalk Rot) Aspergillus flavus, A. parasiticus. Aflatoxin

(2) Diplodia Ear Rot (Diplodia maydis† and D. macrospora-same organisms as for Stalk Rot)

(3) Head Smut (Sphacelotheca reiliana-syn. Ustilago reiliana)

Example 54

Disease Resistance in Wheat

Transgenic wheat plants expressing APS genes and shown to produce APS compound are subjected to the following disease tests. Tests for each pathogen are conducted according to standard phytopathological procedures.

(1) Septoria Diseases (Septoria tritici, S. nodorum)

(2) Powdery Mildew (Erysiphe graminis)

(3) Yellow Rust (Puccinia striiformis)

(4) Brown Rust (Puccinia recondita, P. hordei)

(5) Others-Brown Foot Rot/Seedling Blight (Fusarium culmorum and Fusarium roseum), Eyespot (Pseudocercosporella herpotrichoides), Take-All (Gaeumannomyces graminis)

(6) Viruses (barley yellow mosaic virus, barley yellow dwarf virus, wheat yellow mosaic virus).

N. Assay of Biocontrol Efficacy in Microbial Strains Expressing APS Genes

Example 55

Protection of Cotton against Rhizoctonia solani

Assays to determine protection of cotton from infection caused by Rhizoctonia solani are conducted by planting seeds treated with the biocontrol strain in naturally or artificially infested soil. To create artificially infested soil, millet, rice, oat, or other similar seeds are first moistened with water, then autoclaved and inoculated with plugs of the fungal pathogen taken from an agar plate. When the seeds are fully overgrown with the pathogen, they are air-dried and ground into a powder. The powder is mixed into soil at a rate experimentally determined to cause disease. This infested soil is put into pots, and seeds are placed in furrows 1.5 cm deep. The biocontrol strains are grown in shake flasks in the laboratory. The cells are harvested by centrifugation, resuspended in water, and then drenched over the seeds. Control plants are drenched with water only. Disease may be assessed 14 days later by comparing stand counts and root lesion ratings of treated and nontreated seedlings. The disease ratings may also be compared to the ratings of seedlings grown under the same conditions but without pathogen added to the soil.
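The stand-count comparison described above can be sketched as a simple percentage calculation. The following Python example is illustrative only: the seed number, the counts, and the treatment labels are hypothetical values invented for the sketch, and the specification does not prescribe this particular summary.

```python
def percent_stand(emerged, sown):
    """Stand count expressed as a percentage of seeds sown."""
    if sown <= 0:
        raise ValueError("sown must be positive")
    return 100.0 * emerged / sown

# Hypothetical counts 14 days after planting (20 seeds sown per treatment)
SEEDS_SOWN = 20
stand_counts = {
    "biocontrol strain, infested soil": 16,
    "water only, infested soil": 7,
    "water only, no pathogen": 19,
}

stands = {name: percent_stand(count, SEEDS_SOWN) for name, count in stand_counts.items()}
for name, pct in stands.items():
    print(f"{name}: {pct:.0f}% stand")
```

Including the pathogen-free control in the same summary shows how much of any stand loss is attributable to the pathogen rather than to the growing conditions.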

Example 56

Protection of Potato against Clavibacter michiganense subsp. sepedonicum

Clavibacter michiganense subsp. sepedonicum is the causal agent of potato ring rot disease and is typically spread before planting when "seed" potato tubers are knife cut to generate more planting material. Transmission of the pathogen on the surface of the knife results in the inoculation of entire "seed" batches. Assays to determine protection of potato from the causal agent of ring rot disease are conducted by inoculating potato seed pieces with both the pathogen and the biocontrol strain. The pathogen is introduced by first cutting a naturally infected tuber, then using the knife to cut other tubers into seed pieces. Next, the seed pieces are treated with a suspension of biocontrol bacteria or water as a control. Disease is assessed at the end of the growing season by evaluating plant vigor, yield, and number of tubers infected with Clavibacter.

O. Isolation of APSs from Organisms Expressing the Cloned Genes

Example 57

Extraction Procedures for APS Isolation

Active APSs can be isolated from the cells or growth medium of wild-type or transformed strains that produce the APS. This can be undertaken using known protocols for the isolation of molecules of known characteristics.

For example, for APSs which contain multiple benzene rings (pyrrolnitrin and soraphen) cultures are grown for 24 h in 10 ml L broth at an appropriate temperature and then extracted with an equal volume of ethyl acetate. The organic phase is recovered, evaporated under vacuum and the residue dissolved in 20 μl of methanol.
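The volumes given above imply a nominal concentration factor, which can be checked with a short calculation. The Python sketch below assumes complete recovery of the compound through extraction and evaporation, which the specification does not state; the function name is illustrative.

```python
def concentration_factor(culture_volume_ml, final_volume_ul):
    """Nominal fold-concentration when an extract is redissolved in a smaller volume."""
    if final_volume_ul <= 0:
        raise ValueError("final volume must be positive")
    # Convert the culture volume to microliters so both volumes share a unit
    return culture_volume_ml * 1000 / final_volume_ul

# 10 ml of culture, residue dissolved in 20 microliters of methanol
factor = concentration_factor(10, 20)
print(f"Nominal concentration factor: {factor:.0f}-fold")
```

Actual enrichment will be lower to the extent that the compound partitions incompletely into the ethyl acetate or is lost during evaporation.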

In the case of pyrrolnitrin a further procedure has been used successfully for the extraction of the active antipathogenic compound from the growth medium of the transformed strain producing this antibiotic. This is accomplished by extraction of the medium with 80% acetone followed by removal of the acetone by evaporation and a second extraction with diethyl ether. The diethyl ether is removed by evaporation and the dried extract is resuspended in a small volume of water. Small aliquots of the antibiotic extract applied to small sterile filter paper discs placed on an agar plate will inhibit the growth of Rhizoctonia solani, indicating the presence of the active antibiotic compound.

A preferred method for phenazine isolation is described by Thomashow et al. (Appl Environ Microbiol 56: 908-912 (1990)). This involves acidifying cultures to pH 2.0 with HCl and extraction with benzene. Benzene fractions are dehydrated with Na₂SO₄ and evaporated to dryness. The residue is redissolved in aqueous 5% NaHCO₃, reextracted with an equal volume of benzene, acidified, partitioned into benzene and redried.

For peptide antibiotics (which are typically hydrophobic) extraction techniques using butanol, methanol, chloroform or hexane are suitable. In the case of gramicidin, isolation can be carried out according to the procedure described by Gause & Brazhnikova (Lancet 247: 715 (1944)).

For epidermin, the procedure described by Allgaier et al. (Eur. J. Biochem. 160: 9-22 (1986)) is suitable and involves butanol extraction, and dissolving in methanol and diethyl ether. For many APSs (e.g. pyrrolnitrin, gramicidin, phenazine) appropriate techniques are provided in the Merck Index (Merck & Co., Rahway, N.J. (1989)).

P. Formulation and Use of Isolated Antibiotics

Antifungal formulations can be made using active ingredients which comprise either the isolated APSs or alternatively suspensions or concentrates of cells which produce them. Formulations can be made in liquid or solid form.

Example 58

Liquid Formulation of Antifungal Compositions

In the following examples, percentages of composition are given by weight:

    ______________________________________________________________
    1. Emulsifiable concentrates:          a       b       c
    ______________________________________________________________
    Active ingredient                     20%     40%     50%
    Calcium dodecylbenzenesulfonate        5%      8%      6%
    Castor oil polyethylene glycol         5%     --      --
    ether (36 moles of ethylene oxide)
    Tributylphenol polyethylene glycol    --      12%      4%
    ether (30 moles of ethylene oxide)
    Cyclohexanone                         --      15%     20%
    Xylene mixture                        70%     25%     20%
    ______________________________________________________________

Emulsions of any required concentration can be produced from such concentrates by dilution with water.
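The dilution step can be sketched as a mass balance on the active ingredient: the mass of concentrate times its percentage equals the mass of the final emulsion times the target percentage. The Python example below assumes percentages by weight and additive masses; the 0.1% target, the 1000 g batch size, and the function name are hypothetical choices for illustration.

```python
def dilution_by_weight(conc_percent, target_percent, final_mass_g):
    """Return (grams of concentrate, grams of water) for a target active-ingredient content."""
    if not 0 < target_percent <= conc_percent:
        raise ValueError("target must be positive and no higher than the concentrate")
    # Mass balance on the active ingredient: m_conc * c_conc = m_final * c_target
    concentrate_g = final_mass_g * target_percent / conc_percent
    return concentrate_g, final_mass_g - concentrate_g

# Concentrate (a) contains 20% active ingredient; target and batch size are hypothetical
concentrate_g, water_g = dilution_by_weight(20, 0.1, 1000)
print(f"{concentrate_g:.1f} g concentrate + {water_g:.1f} g water")
```

The same relation applies to the other concentrates once the corresponding active-ingredient percentage is substituted.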

    ______________________________________________________________________
    2. Solutions:                          a       b       c       d
    ______________________________________________________________________
    Active ingredient                     80%     10%      5%     95%
    Ethylene glycol monomethyl ether      20%     --      --      --
    Polyethylene glycol 400               --      70%     --      --
    N-methyl-2-pyrrolidone                --      20%     --      --
    Epoxidised coconut oil                --      --       1%      5%
    Petroleum distillate                  --      --      94%     --
    (boiling range 160-190°)
    ______________________________________________________________________

These solutions are suitable for application in the form of microdrops.

    ______________________________________________________
    3. Granulates:                         a       b
    ______________________________________________________
    Active ingredient                      5%     10%
    Kaolin                                94%     --
    Highly dispersed silicic acid          1%     --
    Attapulgite                           --      90%
    ______________________________________________________

The active ingredient is dissolved in methylene chloride, the solution is sprayed onto the carrier, and the solvent is subsequently evaporated off in vacuo.

    ______________________________________________________
    4. Dusts:                              a       b
    ______________________________________________________
    Active ingredient                      2%      5%
    Highly dispersed silicic acid          1%      5%
    Talcum                                97%     --
    Kaolin                                --      90%
    ______________________________________________________

Ready-to-use dusts are obtained by intimately mixing the carriers with the active ingredient.

Example 59

Solid Formulation of Antifungal Compositions

In the following examples, percentages of compositions are by weight.

    ________________________________________________________________
    1. Wettable powders:                     a       b       c
    ________________________________________________________________
    Active ingredient                       20%     60%     75%
    Sodium lignosulfonate                    5%      5%     --
    Sodium lauryl sulfate                    3%     --       5%
    Sodium diisobutylnaphthalene sulfonate  --       6%     10%
    Octylphenol polyethylene glycol ether   --       2%     --
    (7-8 moles of ethylene oxide)
    Highly dispersed silicic acid            5%     27%     10%
    Kaolin                                  67%     --      --
    ________________________________________________________________

The active ingredient is thoroughly mixed with the adjuvants and the mixture is thoroughly ground in a suitable mill, affording wettable powders which can be diluted with water to give suspensions of the desired concentrations.
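Because the compositions are stated in percent by weight, each column of a formulation table should total 100%. The following Python sketch checks this for the three wettable powder columns; the values are transcribed from the table above, while the dictionary layout and function name are assumptions made for the example.

```python
# Percent-by-weight columns transcribed from the wettable powder table above
WETTABLE_POWDERS = {
    "a": {"active ingredient": 20, "sodium lignosulfonate": 5,
          "sodium lauryl sulfate": 3, "highly dispersed silicic acid": 5,
          "kaolin": 67},
    "b": {"active ingredient": 60, "sodium lignosulfonate": 5,
          "sodium diisobutylnaphthalene sulfonate": 6,
          "octylphenol polyethylene glycol ether": 2,
          "highly dispersed silicic acid": 27},
    "c": {"active ingredient": 75, "sodium lauryl sulfate": 5,
          "sodium diisobutylnaphthalene sulfonate": 10,
          "highly dispersed silicic acid": 10},
}

def column_totals(formulations):
    """Sum the component percentages of each formulation column."""
    return {label: sum(parts.values()) for label, parts in formulations.items()}

totals = column_totals(WETTABLE_POWDERS)
for label, total in totals.items():
    status = "ok" if total == 100 else "does not sum to 100%"
    print(f"Wettable powder {label}: {total}% ({status})")
```

The same check can be applied to the other formulation tables by transcribing their columns in the same way.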

    ______________________________________________
    2. Emulsifiable concentrate:
    ______________________________________________
    Active ingredient                       10%
    Octylphenol polyethylene glycol ether    3%
    (4-5 moles of ethylene oxide)
    Calcium dodecylbenzenesulfonate          3%
    Castor oil polyglycol ether              4%
    (36 moles of ethylene oxide)
    Cyclohexanone                           30%
    Xylene mixture                          50%
    ______________________________________________

Emulsions of any required concentration can be obtained from this concentrate by dilution with water.

    ______________________________________________________
    3. Dusts:                              a       b
    ______________________________________________________
    Active ingredient                      5%      8%
    Talcum                                95%     --
    Kaolin                                --      92%
    ______________________________________________________

Ready-to-use dusts are obtained by mixing the active ingredient with the carriers, and grinding the mixture in a suitable mill.

    ______________________________________________
    4. Extruder granulate:
    ______________________________________________
    Active ingredient                       10%
    Sodium lignosulfonate                    2%
    Carboxymethylcellulose                   1%
    Kaolin                                  87%
    ______________________________________________

The active ingredient is mixed and ground with the adjuvants, and the mixture is subsequently moistened with water. The mixture is extruded and then dried in a stream of air.

    ______________________________________________
    5. Coated granulate:
    ______________________________________________
    Active ingredient                        3%
    Polyethylene glycol 200                  3%
    Kaolin                                  94%
    ______________________________________________

The finely ground active ingredient is uniformly applied, in a mixer, to the kaolin moistened with polyethylene glycol. Non-dusty coated granulates are obtained in this manner.

    ______________________________________________
    6. Suspension concentrate:
    ______________________________________________
    Active ingredient                       40%
    Ethylene glycol                         10%
    Nonylphenol polyethylene glycol          6%
    (15 moles of ethylene oxide)
    Sodium lignosulfonate                   10%
    Carboxymethylcellulose                   1%
    37% aqueous formaldehyde solution      0.2%
    Silicone oil in 75% aqueous emulsion   0.8%
    Water                                   32%
    ______________________________________________

The finely ground active ingredient is intimately mixed with the adjuvants, giving a suspension concentrate from which suspensions of any desired concentration can be obtained by dilution with water.
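To show how a percent-by-weight recipe translates into weighed amounts, the following Python sketch scales the suspension concentrate composition to a batch. The percentages are transcribed from the table above; the 500 g batch size, the dictionary keys, and the function name are hypothetical and chosen only for illustration.

```python
# Percent-by-weight recipe transcribed from the suspension concentrate table above
SUSPENSION_CONCENTRATE = {
    "active ingredient": 40.0,
    "ethylene glycol": 10.0,
    "nonylphenol polyethylene glycol (15 moles of ethylene oxide)": 6.0,
    "sodium lignosulfonate": 10.0,
    "carboxymethylcellulose": 1.0,
    "37% aqueous formaldehyde solution": 0.2,
    "silicone oil in 75% aqueous emulsion": 0.8,
    "water": 32.0,
}

def batch_masses(recipe, batch_mass_g):
    """Scale a percent-by-weight recipe to component masses in grams."""
    total = sum(recipe.values())
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"recipe sums to {total}%, expected 100%")
    return {name: batch_mass_g * pct / 100.0 for name, pct in recipe.items()}

masses = batch_masses(SUSPENSION_CONCENTRATE, 500.0)  # hypothetical 500 g batch
for name, grams in masses.items():
    print(f"{name}: {grams:.1f} g")
```

Validating the total before scaling guards against transcription errors in the recipe.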

While the present invention has been described with reference to specific embodiments thereof, it will be appreciated that numerous variations, modifications, and embodiments are possible, and accordingly, all such variations, modifications and embodiments are to be regarded as being within the spirit and scope of the present invention.

    __________________________________________________________________________
    SEQUENCE LISTING
    (1) GENERAL INFORMATION:
    (iii) NUMBER OF SEQUENCES: 22
    (2) INFORMATION FOR SEQ ID NO:1:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 7001 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: DNA (genomic)
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 423..2036
    (D) OTHER INFORMATION: /label=ORF1
    /note= "Open Reading Frame #1 of DNA sequence"
    (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 2039..3121
    (D) OTHER INFORMATION: /label=ORF2
    /note= "Open Reading Frame #2 of DNA sequence"
    (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 3167..4867
    (D) OTHER INFORMATION: /label=ORF3
    /note= "Open Reading Frame #3 of DNA sequence"
    (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 4895..5983
    (D) OTHER INFORMATION: /label=ORF4
    /note= "Open Reading Frame #4 of DNA sequence"
    (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1..7001
    (D) OTHER INFORMATION: /note= "Four open reading frames
    (ORFs) were identified within this DNA sequence and are
    transcribed as a single message, as described in Examples
    10 and 12 of the specification."
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:                                       GAATTCCGACAACGCCGAAGAAGCGCGGAACCGCTGAAAGAGGAGCAGGAACTGGAGCAA60                ACGCTGTCCCAGGTGATCGACAGCCTGCCACTGCGCATCGAGGGCCGATGAACAGCATTG120               GCAAAAGCTGGCGGTGCGCAGTGCGCGAGTGATCCGATCATTTTTGATCGGCTCGCCTCT180               TCAAAATCGGCGGTGGATGAAGTCGACGGCGGACTGATCAGGCGCAAAAGAACATGCGCC240               AAAACCTTCTTTTATAGCGAATACCTTTGCACTTCAGAATGTTAATTCGGAAACGGAATT300               TGCATCGCTTTTCCGGCAGTCTAGAGTCTCTAACAGCACATTGATGTGCCTCTTGCATGG360               ATGCACGAAGACTGGCGGCCTCCCCTCGTCACAGGCGGCCCGCCTTTGAAACAAGGAGTG420               TTATGAACAAGCCGATCAAGAATATCGTCATCGTGGGCGGCGGTACT467                            MetAsnLysProIleLysAsnIleValIleValGlyGlyGlyThr                                 151015                                                                        GCGGGCTGGATGGCCGCCTCGTACCTCGTCCGGGCCCTCCAACAGCAG515                           AlaGlyTrpMetAlaAlaSerTyrLeuValArgAlaLeuGlnGlnGln                              202530                                                                        GCGAACATTACGCTCATCGAATCTGCGGCGATCCCTCGGATCGGCGTG563                           AlaAsnIleThrLeuIleGluSerAlaAlaIleProArgIleGlyVal                              354045                                                                        GGCGAAGCGACCATCCCAAGTTTGCAGAAGGTGTTCTTCGATTTCCTC611                           GlyGluAlaThrIleProSerLeuGlnLysValPhePheAspPheLeu                              505560                                                                        GGGATACCGGAGCGGGAATGGATGCCCCAAGTGAACGGCGCGTTCAAG659                           GlyIleProGluArgGluTrpMetProGlnValAsnGlyAlaPheLys                              657075                                                                        GCCGCGATCAAGTTCGTGAATTGGAGAAAGTCTCCCGACCCCTCGCGC707                           AlaAlaIleLysPheValAsnTrpArgLysSerProAspProSerArg                              80859095                                          
                            GACGATCACTTCTACCATTTGTTCGGCAACGTGCCGAACTGCGACGGC755                           AspAspHisPheTyrHisLeuPheGlyAsnValProAsnCysAspGly                              100105110                                                                     GTGCCGCTTACCCACTACTGGCTGCGCAAGCGCGAACAGGGCTTCCAG803                           ValProLeuThrHisTyrTrpLeuArgLysArgGluGlnGlyPheGln                              115120125                                                                     CAGCCGATGGAGTACGCGTGCTACCCGCAGCCCGGGGCACTCGACGGC851                           GlnProMetGluTyrAlaCysTyrProGlnProGlyAlaLeuAspGly                              130135140                                                                     AAGCTGGCACCGTGCCTGTCCGACGGCACCCGCCAGATGTCCCACGCG899                           LysLeuAlaProCysLeuSerAspGlyThrArgGlnMetSerHisAla                              145150155                                                                     TGGCACTTCGACGCGCACCTGGTGGCCGACTTCTTGAAGCGCTGGGCC947                           TrpHisPheAspAlaHisLeuValAlaAspPheLeuLysArgTrpAla                              160165170175                                                                  GTCGAGCGCGGGGTGAACCGCGTGGTCGATGAGGTGGTGGACGTTCGC995                           ValGluArgGlyValAsnArgValValAspGluValValAspValArg                              180185190                                                                     CTGAACAACCGCGGCTACATCTCCAACCTGCTCACCAAGGAGGGGCGG1043                          LeuAsnAsnArgGlyTyrIleSerAsnLeuLeuThrLysGluGlyArg                              195200205                                                                     ACGCTGGAGGCGGACCTGTTCATCGACTGCTCCGGCATGCGGGGGCTC1091                          ThrLeuGluAlaAspLeuPheIleAspCysSerGlyMetArgGlyLeu                              210215220                                                                     CTGATCAATCAGGCGCTGAAGGAACCCTTCATCGACATGTCCGACTAC1139                          
LeuIleAsnGlnAlaLeuLysGluProPheIleAspMetSerAspTyr                              225230235                                                                     CTGCTGTGCGACAGCGCGGTCGCCAGCGCCGTGCCCAACGACGACGCG1187                          LeuLeuCysAspSerAlaValAlaSerAlaValProAsnAspAspAla                              240245250255                                                                  CGCGATGGGGTCGAGCCGTACACCTCCTCGATCGCCATGAACTCGGGA1235                          ArgAspGlyValGluProTyrThrSerSerIleAlaMetAsnSerGly                              260265270                                                                     TGGACCTGGAAGATTCCGATGCTGGGCCGGTTCGGCAGCGGCTACGTC1283                          TrpThrTrpLysIleProMetLeuGlyArgPheGlySerGlyTyrVal                              275280285                                                                     TTCTCGAGCCATTTCACCTCGCGCGACCAGGCCACCGCCGACTTCCTC1331                          PheSerSerHisPheThrSerArgAspGlnAlaThrAlaAspPheLeu                              290295300                                                                     AAACTCTGGGGCCTCTCGGACAATCAGCCGCTCAACCAGATCAAGTTC1379                          LysLeuTrpGlyLeuSerAspAsnGlnProLeuAsnGlnIleLysPhe                              305310315                                                                     CGGGTCGGGCGCAACAAGCGGGCGTGGGTCAACAACTGCGTCTCGATC1427                          ArgValGlyArgAsnLysArgAlaTrpValAsnAsnCysValSerIle                              320325330335                                                                  GGGCTGTCGTCGTGCTTTCTGGAGCCCCTGGAATCGACGGGGATCTAC1475                          GlyLeuSerSerCysPheLeuGluProLeuGluSerThrGlyIleTyr                              340345350                                                                     TTCATCTACGCGGCGCTTTACCAGCTCGTGAAGCACTTCCCCGACACC1523                          PheIleTyrAlaAlaLeuTyrGlnLeuValLysHisPheProAspThr                              355360365                                         
                            TCGTTCGACCCGCGGCTGAGCGACGCTTTCAACGCCGAGATCGTCCAC1571                          SerPheAspProArgLeuSerAspAlaPheAsnAlaGluIleValHis                              370375380                                                                     ATGTTCGACGACTGCCGGGATTTCGTCCAAGCGCACTATTTCACCACG1619                          MetPheAspAspCysArgAspPheValGlnAlaHisTyrPheThrThr                              385390395                                                                     TCGCGCGATGACACGCCGTTCTGGCTCGCGAACCGGCACGACCTGCGG1667                          SerArgAspAspThrProPheTrpLeuAlaAsnArgHisAspLeuArg                              400405410415                                                                  CTCTCGGACGCCATCAAAGAGAAGGTTCAGCGCTACAAGGCGGGGCTG1715                          LeuSerAspAlaIleLysGluLysValGlnArgTyrLysAlaGlyLeu                              420425430                                                                     CCGCTGACCACCACGTCGTTCGACGATTCCACGTACTACGAGACCTTC1763                          ProLeuThrThrThrSerPheAspAspSerThrTyrTyrGluThrPhe                              435440445                                                                     GACTACGAATTCAAGAATTTCTGGTTGAACGGCAACTACTACTGCATC1811                          AspTyrGluPheLysAsnPheTrpLeuAsnGlyAsnTyrTyrCysIle                              450455460                                                                     TTTGCCGGCTTGGGCATGCTGCCCGACCGGTCGCTGCCGCTGTTGCAG1859                          PheAlaGlyLeuGlyMetLeuProAspArgSerLeuProLeuLeuGln                              465470475                                                                     CACCGACCGGAGTCGATCGAGAAAGCCGAGGCGATGTTCGCCAGCATC1907                          HisArgProGluSerIleGluLysAlaGluAlaMetPheAlaSerIle                              480485490495                                                                  CGGCGCGAGGCCGAGCGTCTGCGCACCAGCCTGCCGACAAACTACGAC1955                          
ArgArgGluAlaGluArgLeuArgThrSerLeuProThrAsnTyrAsp                              500505510                                                                     TACCTGCGGTCGCTGCGTGACGGCGACGCGGGGCTGTCGCGCGGCCAG2003                          TyrLeuArgSerLeuArgAspGlyAspAlaGlyLeuSerArgGlyGln                              515520525                                                                     CGTGGGCCGAAGCTCGCAGCGCAGGAAAGCCTGTAGTGGAACGCACC2050                           ArgGlyProLysLeuAlaAlaGlnGluSerLeuMetGluArgThr                                 5305351                                                                       TTGGACCGGGTAGGCGTATTCGCGGCCACCCACGCTGCCGTGGCGGCC2098                          LeuAspArgValGlyValPheAlaAlaThrHisAlaAlaValAlaAla                              5101520                                                                       TGCGATCCGCTGCAGGCGCGCGCGCTCGTTCTGCAACTGCCGGGCCTG2146                          CysAspProLeuGlnAlaArgAlaLeuValLeuGlnLeuProGlyLeu                              253035                                                                        AACCGTAACAAGGACGTGCCCGGTATCGTCGGCCTGCTGCGCGAGTTC2194                          AsnArgAsnLysAspValProGlyIleValGlyLeuLeuArgGluPhe                              404550                                                                        CTTCCGGTGCGCGGCCTGCCCTGCGGCTGGGGTTTCGTCGAAGCCGCC2242                          LeuProValArgGlyLeuProCysGlyTrpGlyPheValGluAlaAla                              556065                                                                        GCCGCGATGCGGGACATCGGGTTCTTCCTGGGGTCGCTCAAGCGCCAC2290                          AlaAlaMetArgAspIleGlyPhePheLeuGlySerLeuLysArgHis                              707580                                                                        GGACATGAGCCCGCGGAGGTGGTGCCCGGGCTTGAGCCGGTGCTGCTC2338                          GlyHisGluProAlaGluValValProGlyLeuGluProValLeuLeu                              859095100                                         
                            GACCTGGCACGCGCGACCAACCTGCCGCCGCGCGAGACGCTCCTGCAT2386                          AspLeuAlaArgAlaThrAsnLeuProProArgGluThrLeuLeuHis                              105110115                                                                     GTGACGGTCTGGAACCCCACGGCGGCCGACGCGCAGCGCAGCTACACC2434                          ValThrValTrpAsnProThrAlaAlaAspAlaGlnArgSerTyrThr                              120125130                                                                     GGGCTGCCCGACGAAGCGCACCTGCTCGAGAGCGTGCGCATCTCGATG2482                          GlyLeuProAspGluAlaHisLeuLeuGluSerValArgIleSerMet                              135140145                                                                     GCGGCCCTCGAGGCGGCCATCGCGTTGACCGTCGAGCTGTTCGATGTG2530                          AlaAlaLeuGluAlaAlaIleAlaLeuThrValGluLeuPheAspVal                              150155160                                                                     TCCCTGCGGTCGCCCGAGTTCGCGCAAAGGTGCGACGAGCTGGAAGCC2578                          SerLeuArgSerProGluPheAlaGlnArgCysAspGluLeuGluAla                              165170175180                                                                  TATCTGCAGAAAATGGTCGAATCGATCGTCTACGCGTACCGCTTCATC2626                          TyrLeuGlnLysMetValGluSerIleValTyrAlaTyrArgPheIle                              185190195                                                                     TCGCCGCAGGTCTTCTACGATGAGCTGCGCCCCTTCTACGAACCGATT2674                          SerProGlnValPheTyrAspGluLeuArgProPheTyrGluProIle                              200205210                                                                     CGAGTCGGGGGCCAGAGCTACCTCGGCCCCGGTGCCGTAGAGATGCCC2722                          ArgValGlyGlyGlnSerTyrLeuGlyProGlyAlaValGluMetPro                              215220225                                                                     CTCTTCGTGCTGGAGCACGTCCTCTGGGGCTCGCAATCGGACGACCAA2770                          
LeuPheValLeuGluHisValLeuTrpGlySerGlnSerAspAspGln                              230235240                                                                     ACTTATCGAGAATTCAAAGAGACGTACCTGCCCTATGTGCTTCCCGCG2818                          ThrTyrArgGluPheLysGluThrTyrLeuProTyrValLeuProAla                              245250255260                                                                  TACAGGGCGGTCTACGCTCGGTTCTCCGGGGAGCCGGCGCTCATCGAC2866                          TyrArgAlaValTyrAlaArgPheSerGlyGluProAlaLeuIleAsp                              265270275                                                                     CGCGCGCTCGACGAGGCGCGAGCGGTCGGTACGCGGGACGAGCACGTC2914                          ArgAlaLeuAspGluAlaArgAlaValGlyThrArgAspGluHisVal                              280285290                                                                     CGGGCTGGGCTGACAGCCCTCGAGCGGGTCTTCAAGGTCCTGCTGCGC2962                          ArgAlaGlyLeuThrAlaLeuGluArgValPheLysValLeuLeuArg                              295300305                                                                     TTCCGGGCGCCTCACCTCAAATTGGCGGAGCGGGCGTACGAAGTCGGG3010                          PheArgAlaProHisLeuLysLeuAlaGluArgAlaTyrGluValGly                              310315320                                                                     CAAAGCGGCCCCGAAATCGGCAGCGGGGGGTACGCGCCCAGCATGCTC3058                          GlnSerGlyProGluIleGlySerGlyGlyTyrAlaProSerMetLeu                              325330335340                                                                  GGTGAGCTGCTCACGCTGACGTATGCCGCGCGGTCCCGCGTCCGCGCC3106                          GlyGluLeuLeuThrLeuThrTyrAlaAlaArgSerArgValArgAla                              345350355                                                                     GCGCTCGACGAATCCTGATGCGCGCGACCCAGTGTTATCTCACAAGGAGAGTTTG3161                   AlaLeuAspGluSer                                                               360                                               
                            CCCCCATGACTCAGAAGAGCCCCGCGAACGAACACGATAGCAATCAC3208                           MetThrGlnLysSerProAlaAsnGluHisAspSerAsnHis                                    1510                                                                          TTCGACGTAATCATCCTCGGCTCGGGCATGTCCGGCACCCAGATGGGG3256                          PheAspValIleIleLeuGlySerGlyMetSerGlyThrGlnMetGly                              15202530                                                                      GCCATCTTGGCCAAACAACAGTTTCGCGTGCTGATCATCGAGGAGTCG3304                          AlaIleLeuAlaLysGlnGlnPheArgValLeuIleIleGluGluSer                              354045                                                                        TCGCACCCGCGGTTCACGATCGGCGAATCGTCGATCCCCGAGACGTCT3352                          SerHisProArgPheThrIleGlyGluSerSerIleProGluThrSer                              505560                                                                        CTTATGAACCGCATCATCGCTGATCGCTACGGCATTCCGGAGCTCGAC3400                          LeuMetAsnArgIleIleAlaAspArgTyrGlyIleProGluLeuAsp                              657075                                                                        CACATCACGTCGTTTTATTCGACGCAACGTTACGTCGCGTCGAGCACG3448                          HisIleThrSerPheTyrSerThrGlnArgTyrValAlaSerSerThr                              808590                                                                        GGCATTAAGCGCAACTTCGGCTTCGTGTTCCACAAGCCCGGCCAGGAG3496                          GlyIleLysArgAsnPheGlyPheValPheHisLysProGlyGlnGlu                              95100105110                                                                   CACGACCCGAAGGAGTTCACCCAGTGCGTCATTCCCGAGCTGCCGTGG3544                          HisAspProLysGluPheThrGlnCysValIleProGluLeuProTrp                              115120125                                                                     GGGCCGGAGAGCCATTATTACCGGCAAGACGTCGACGCCTACTTGTTG3592                          
GlyProGluSerHisTyrTyrArgGlnAspValAspAlaTyrLeuLeu                              130135140                                                                     CAAGCCGCCATTAAATACGGCTGCAAGGTCCACCAGAAAACTACCGTG3640                          GlnAlaAlaIleLysTyrGlyCysLysValHisGlnLysThrThrVal                              145150155                                                                     ACCGAATACCACGCCGATAAAGACGGCGTCGCGGTGACCACCGCCCAG3688                          ThrGluTyrHisAlaAspLysAspGlyValAlaValThrThrAlaGln                              160165170                                                                     GGCGAACGGTTCACCGGCCGGTACATGATCGACTGCGGAGGACCTCGC3736                          GlyGluArgPheThrGlyArgTyrMetIleAspCysGlyGlyProArg                              175180185190                                                                  GCGCCGCTCGCGACCAAGTTCAAGCTCCGCGAAGAACCGTGTCGCTTC3784                          AlaProLeuAlaThrLysPheLysLeuArgGluGluProCysArgPhe                              195200205                                                                     AAGACGCACTCGCGCAGCCTCTACACGCACATGCTCGGGGTCAAGCCG3832                          LysThrHisSerArgSerLeuTyrThrHisMetLeuGlyValLysPro                              210215220                                                                     TTCGACGACATCTTCAAGGTCAAGGGGCAGCGCTGGCGCTGGCACGAG3880                          PheAspAspIlePheLysValLysGlyGlnArgTrpArgTrpHisGlu                              225230235                                                                     GGGACCTTGCACCACATGTTCGAGGGCGGCTGGCTCTGGGTGATTCCG3928                          GlyThrLeuHisHisMetPheGluGlyGlyTrpLeuTrpValIlePro                              240245250                                                                     TTCAACAACCACCCGCGGTCGACCAACAACCTGGTGAGCGTCGGCCTG3976                          PheAsnAsnHisProArgSerThrAsnAsnLeuValSerValGlyLeu                              255260265270                                      
                            CAGCTCGACCCGCGTGTCTACCCGAAAACCGACATCTCCGCACAGCAG4024                          GlnLeuAspProArgValTyrProLysThrAspIleSerAlaGlnGln                              275280285                                                                     GAATTCGATGAGTTCCTCGCGCGGTTCCCGAGCATCGGGGCTCAGTTC4072                          GluPheAspGluPheLeuAlaArgPheProSerIleGlyAlaGlnPhe                              290295300                                                                     CGGGACGCCGTGCCGGTGCGCGACTGGGTCAAGACCGACCGCCTGCAA4120                          ArgAspAlaValProValArgAspTrpValLysThrAspArgLeuGln                              305310315                                                                     TTCTCGTCGAACGCCTGCGTCGGCGACCGCTACTGCCTGATGCTGCAC4168                          PheSerSerAsnAlaCysValGlyAspArgTyrCysLeuMetLeuHis                              320325330                                                                     GCGAACGGCTTCATCGACCCGCTCTTCTCCCGGGGGCTGGAAAACACC4216                          AlaAsnGlyPheIleAspProLeuPheSerArgGlyLeuGluAsnThr                              335340345350                                                                  GCGGTGACCATCCACGCGCTCGCGGCGCGCCTCATCAAGGCGCTGCGC4264                          AlaValThrIleHisAlaLeuAlaAlaArgLeuIleLysAlaLeuArg                              355360365                                                                     GACGACGACTTCTCCCCCGAGCGCTTCGAGTACATCGAGCGCCTGCAG4312                          AspAspAspPheSerProGluArgPheGluTyrIleGluArgLeuGln                              370375380                                                                     CAAAAGCTTTTGGACCACAACGACGACTTCGTCAGCTGCTGCTACACG4360                          GlnLysLeuLeuAspHisAsnAspAspPheValSerCysCysTyrThr                              385390395                                                                     GCGTTCTCGGACTTCCGCCTATGGGACGCGTTCCACAGGCTGTGGGCG4408                          
AlaPheSerAspPheArgLeuTrpAspAlaPheHisArgLeuTrpAla                              400405410                                                                     GTCGGCACCATCCTCGGGCAGTTCCGGCTCGTGCAGGCCCACGCGAGG4456                          ValGlyThrIleLeuGlyGlnPheArgLeuValGlnAlaHisAlaArg                              415420425430                                                                  TTCCGCGCGTCGCGCAACGAGGGCGACCTCGATCACCTCGACAACGAC4504                          PheArgAlaSerArgAsnGluGlyAspLeuAspHisLeuAspAsnAsp                              435440445                                                                     CCTCCGTATCTCGGATACCTGTGCGCGGACATGGAGGAGTACTACCAG4552                          ProProTyrLeuGlyTyrLeuCysAlaAspMetGluGluTyrTyrGln                              450455460                                                                     TTGTTCAACGACGCCAAAGCCGAGGTCGAGGCCGTGAGTGCCGGGCGC4600                          LeuPheAsnAspAlaLysAlaGluValGluAlaValSerAlaGlyArg                              465470475                                                                     AAGCCGGCCGATGAGGCCGCGGCGCGGATTCACGCCCTCATTGACGAA4648                          LysProAlaAspGluAlaAlaAlaArgIleHisAlaLeuIleAspGlu                              480485490                                                                     CGAGACTTCGCCAAGCCGATGTTCGGCTTCGGGTACTGCATCACCGGG4696                          ArgAspPheAlaLysProMetPheGlyPheGlyTyrCysIleThrGly                              495500505510                                                                  GACAAGCCGCAGCTCAACAACTCGAAGTACAGCCTGCTGCCGGCGATG4744                          AspLysProGlnLeuAsnAsnSerLysTyrSerLeuLeuProAlaMet                              515520525                                                                     CGGCTGATGTACTGGACGCAAACCCGCGCGCCGGCAGAGGTGAAAAAG4792                          ArgLeuMetTyrTrpThrGlnThrArgAlaProAlaGluValLysLys                              530535540                                         
                            TACTTCGACTACAACCCGATGTTCGCGCTGCTCAAGGCGTACATCACG4840                          TyrPheAspTyrAsnProMetPheAlaLeuLeuLysAlaTyrIleThr                              545550555                                                                     ACCCGCATCGGCCTGGCGCTGAAGAAGTAGCCGCTCGACGACGACAT4887                           ThrArgIleGlyLeuAlaLeuLysLys                                                   560565                                                                        AAAAACGATGAACGACATTCAATTGGATCAAGCGAGCGTCAAGAAGCGT4936                         MetAsnAspIleGlnLeuAspGlnAlaSerValLysLysArg                                    1510                                                                          CCCTCGGGCGCGTACGACGCAACCACGCGCCTGGCCGCGAGCTGGTAC4984                          ProSerGlyAlaTyrAspAlaThrThrArgLeuAlaAlaSerTrpTyr                              15202530                                                                      GTCGCGATGCGCTCCAACGAGCTCAAGGACAAGCCGACCGAGTTGACG5032                          ValAlaMetArgSerAsnGluLeuLysAspLysProThrGluLeuThr                              354045                                                                        CTCTTCGGCCGTCCGTGCGTGGCGTGGCGCGGAGCCACGGGGCGGGCC5080                          LeuPheGlyArgProCysValAlaTrpArgGlyAlaThrGlyArgAla                              505560                                                                        GTGGTGATGGACCGCCACTGCTCGCACCTGGGCGCGAACCTGGCTGAC5128                          ValValMetAspArgHisCysSerHisLeuGlyAlaAsnLeuAlaAsp                              657075                                                                        GGGCGGATCAAGGACGGGTGCATCCAGTGCCCGTTTCACCACTGGCGG5176                          GlyArgIleLysAspGlyCysIleGlnCysProPheHisHisTrpArg                              808590                                                                        TACGACGAACAGGGCCAGTGCGTTCACATCCCCGGCCATAACCAGGCG5224                          
TyrAspGluGlnGlyGlnCysValHisIleProGlyHisAsnGlnAla                              95100105110                                                                   GTGCGCCAGCTGGAGCCGGTGCCGCGCGGGGCGCGTCAGCCGACGTTG5272                          ValArgGlnLeuGluProValProArgGlyAlaArgGlnProThrLeu                              115120125                                                                     GTCACCGCCGAGCGATACGGCTACGTGTGGGTCTGGTACGGCTCCCCG5320                          ValThrAlaGluArgTyrGlyTyrValTrpValTrpTyrGlySerPro                              130135140                                                                     CTGCCGCTGCACCCGCTGCCCGAAATCTCCGCGGCCGATGTCGACAAC5368                          LeuProLeuHisProLeuProGluIleSerAlaAlaAspValAspAsn                              145150155                                                                     GGCGACTTTATGCACCTGCACTTCGCGTTCGAGACGACCACGGCGGTC5416                          GlyAspPheMetHisLeuHisPheAlaPheGluThrThrThrAlaVal                              160165170                                                                     TTGCGGATCGTCGAGAACTTCTACGACGCGCAGCACGCAACCCCGGTG5464                          LeuArgIleValGluAsnPheTyrAspAlaGlnHisAlaThrProVal                              175180185190                                                                  CACGCACTCCCGATCTCGGCCTTCGAACTCAAGCTCTTCGACGATTGG5512                          HisAlaLeuProIleSerAlaPheGluLeuLysLeuPheAspAspTrp                              195200205                                                                     CGCCAGTGGCCGGAGGTTGAGTCGCTGGCCCTGGCGGGCGCGTGGTTC5560                          ArgGlnTrpProGluValGluSerLeuAlaLeuAlaGlyAlaTrpPhe                              210215220                                                                     GGTGCCGGGATCGACTTCACCGTGGACCGGTACTTCGGCCCCCTCGGC5608                          GlyAlaGlyIleAspPheThrValAspArgTyrPheGlyProLeuGly                              225230235                                         
                            ATGCTGTCACGCGCGCTCGGCCTGAACATGTCGCAGATGAACCTGCAC5656                          MetLeuSerArgAlaLeuGlyLeuAsnMetSerGlnMetAsnLeuHis                              240245250                                                                     TTCGATGGCTACCCCGGCGGGTGCGTCATGACCGTCGCCCTGGACGGA5704                          PheAspGlyTyrProGlyGlyCysValMetThrValAlaLeuAspGly                              255260265270                                                                  GACGTCAAATACAAGCTGCTCCAGTGTGTGACGCCGGTGAGCGAAGGC5752                          AspValLysTyrLysLeuLeuGlnCysValThrProValSerGluGly                              275280285                                                                     AAGAACGTCATGCACATGCTCATCTCGATCAAGAAGGTGGGCGGCATC5800                          LysAsnValMetHisMetLeuIleSerIleLysLysValGlyGlyIle                              290295300                                                                     CTGCGCCGCGCGACCGACTTCGTGCTGTTCGGGCTGCAGACCAGGCAG5848                          LeuArgArgAlaThrAspPheValLeuPheGlyLeuGlnThrArgGln                              305310315                                                                     GCCGCGGGGTACGACGTCAAAATCTGGAACGGAATGAAGCCGGACGGC5896                          AlaAlaGlyTyrAspValLysIleTrpAsnGlyMetLysProAspGly                              320325330                                                                     GGCGGCGCGTACAGCAAGTACGACAAGCTCGTGCTCAAGTACCGGGCG5944                          GlyGlyAlaTyrSerLysTyrAspLysLeuValLeuLysTyrArgAla                              335340345350                                                                  TTCTATCGAGGCTGGGTCGACCGCGTCGCAAGTGAGCGGTGATGCGTGA5993                         PheTyrArgGlyTrpValAspArgValAlaSerGluArg                                       355360                                                                        AGCCGAGCCGCTCTCGACCGCGTCGCTGCGCCAGGCGCTCGCGAACCTGGCGAGCGGCGT6053              
GACGATCACGGCCTACGGCGCGCCGGGCCCGCTTGGGCTCGCGGCCACCAGCTTCGTGTC 6113
GGAGTCGCTCTTTGCGAGGTATTCATGACTATCTGGCTGTTGCAACTCGTGCTGGTGATC 6173
GCGCTCTGCAACGTCTGCGGCCGCATTGCCGAACGGCTCGGCCAGTGCGCGGTCATCGGC 6233
GAGATCGCGGCCGGTTTGCTGTTGGGGCCGTCGCTGTTCGGCGTGATCGCACCGAGTTTC 6293
TACGACCTGTTGTTCGGCCCCCAGGTGCTGTCAGCGATGGCGCAAGTCAGCGAAGTCGGC 6353
CTGGTACTGCTGATGTTCCAGGTCGGCCTGCATATGGAGTTGGGCGAGACGCTGCGCGAC 6413
AAGCGCTGGCGCATGCCCGTCGCGATCGCAGCGGGCGGGCTCGTCGCACCGGCCGCGATC 6473
GGCATGATCGTCGCCATCGTTTCGAAAGGCACGCTCGCCAGCGACGCGCCGGCGCTGCCC 6533
TATGTGCTCTTCTGCGGTGTCGCACTTGCGGTATCGGCGGTGCCGGTGATGGCGCGCATC 6593
ATCGACGACCTGGAGCTCAGCGCCATGGTGGGCGCGCGGCACGCAATGTCTGCCGCGATG 6653
CTGACGGATGCGCTCGGATGGATGCTGCTTGCAACGATTGCCTCGCTATCGAGCGGGCCC 6713
GGCTGGGCATTTGCGCGCATGCTCGTCAGCCTGCTCGCGTATCTGGTGCTGTGCGCGCTG 6773
CTGGTGCGCTTCGTGGTTCGACCGACCCTTGCGCGGCTCGCGTCGACCGCGCATGCGACG 6833
CGCGACCGCTTGGCCGTGTTGTTCTGCTTCGTAATGTTGTCGGCACTCGCGACGTCGCTG 6893
ATCGGATTCCATAGCGCTTTTGGCGCACTTGCCGCGGCGCTGTTCGTGCGCCGGGTGCCC 6953
GGCGTCGCGAAGGAGTGGCGCGACAACGTCGAAGGTTTCGTCAAGCTT 7001

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 538 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
MetAsnLysProIleLysAsnIleValIleValGlyGlyGlyThrAla
1 5 10 15
GlyTrpMetAlaAlaSerTyrLeuValArgAlaLeuGlnGlnGlnAla
20 25 30
AsnIleThrLeuIleGluSerAlaAlaIleProArgIleGlyValGly
35 40 45
GluAlaThrIleProSerLeuGlnLysValPhePheAspPheLeuGly
50 55 60
IleProGluArgGluTrpMetProGlnValAsnGlyAlaPheLysAla
65 70 75 80
AlaIleLysPheValAsnTrpArgLysSerProAspProSerArgAsp
85 90 95
AspHisPheTyrHisLeuPheGlyAsnValProAsnCysAspGlyVal
100 105 110
ProLeuThrHisTyrTrpLeuArgLysArgGluGlnGlyPheGlnGln
115 120 125
ProMetGluTyrAlaCysTyrProGlnProGlyAlaLeuAspGlyLys
130 135 140
LeuAlaProCysLeuSerAspGlyThrArgGlnMetSerHisAlaTrp
145 150 155 160
HisPheAspAlaHisLeuValAlaAspPheLeuLysArgTrpAlaVal
165 170 175
GluArgGlyValAsnArgValValAspGluValValAspValArgLeu
180 185 190
AsnAsnArgGlyTyrIleSerAsnLeuLeuThrLysGluGlyArgThr
195 200 205
LeuGluAlaAspLeuPheIleAspCysSerGlyMetArgGlyLeuLeu
210 215 220
IleAsnGlnAlaLeuLysGluProPheIleAspMetSerAspTyrLeu
225 230 235 240
LeuCysAspSerAlaValAlaSerAlaValProAsnAspAspAlaArg
245 250 255
AspGlyValGluProTyrThrSerSerIleAlaMetAsnSerGlyTrp
260 265 270
ThrTrpLysIleProMetLeuGlyArgPheGlySerGlyTyrValPhe
275 280 285
SerSerHisPheThrSerArgAspGlnAlaThrAlaAspPheLeuLys
290 295 300
LeuTrpGlyLeuSerAspAsnGlnProLeuAsnGlnIleLysPheArg
305 310 315 320
ValGlyArgAsnLysArgAlaTrpValAsnAsnCysValSerIleGly
325 330 335
LeuSerSerCysPheLeuGluProLeuGluSerThrGlyIleTyrPhe
340 345 350
IleTyrAlaAlaLeuTyrGlnLeuValLysHisPheProAspThrSer
355 360 365
PheAspProArgLeuSerAspAlaPheAsnAlaGluIleValHisMet
370 375 380
PheAspAspCysArgAspPheValGlnAlaHisTyrPheThrThrSer
385 390 395 400
ArgAspAspThrProPheTrpLeuAlaAsnArgHisAspLeuArgLeu
405 410 415
SerAspAlaIleLysGluLysValGlnArgTyrLysAlaGlyLeuPro
420 425 430
LeuThrThrThrSerPheAspAspSerThrTyrTyrGluThrPheAsp
435 440 445
TyrGluPheLysAsnPheTrpLeuAsnGlyAsnTyrTyrCysIlePhe
450 455 460
AlaGlyLeuGlyMetLeuProAspArgSerLeuProLeuLeuGlnHis
465 470 475 480
ArgProGluSerIleGluLysAlaGluAlaMetPheAlaSerIleArg
485 490 495
ArgGluAlaGluArgLeuArgThrSerLeuProThrAsnTyrAspTyr
500 505 510
LeuArgSerLeuArgAspGlyAspAlaGlyLeuSerArgGlyGlnArg
515 520 525
GlyProLysLeuAlaAlaGlnGluSerLeu
530 535

(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 361 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
ValGluArgThrLeuAspArgValGlyValPheAlaAlaThrHisAla
1 5 10 15
AlaValAlaAlaCysAspProLeuGlnAlaArgAlaLeuValLeuGln
20 25 30
LeuProGlyLeuAsnArgAsnLysAspValProGlyIleValGlyLeu
35 40 45
LeuArgGluPheLeuProValArgGlyLeuProCysGlyTrpGlyPhe
50 55 60
ValGluAlaAlaAlaAlaMetArgAspIleGlyPhePheLeuGlySer
65 70 75 80
LeuLysArgHisGlyHisGluProAlaGluValValProGlyLeuGlu
85 90 95
ProValLeuLeuAspLeuAlaArgAlaThrAsnLeuProProArgGlu
100 105 110
ThrLeuLeuHisValThrValTrpAsnProThrAlaAlaAspAlaGln
115 120 125
ArgSerTyrThrGlyLeuProAspGluAlaHisLeuLeuGluSerVal
130 135 140
ArgIleSerMetAlaAlaLeuGluAlaAlaIleAlaLeuThrValGlu
145 150 155 160
LeuPheAspValSerLeuArgSerProGluPheAlaGlnArgCysAsp
165 170 175
GluLeuGluAlaTyrLeuGlnLysMetValGluSerIleValTyrAla
180 185 190
TyrArgPheIleSerProGlnValPheTyrAspGluLeuArgProPhe
195 200 205
TyrGluProIleArgValGlyGlyGlnSerTyrLeuGlyProGlyAla
210 215 220
ValGluMetProLeuPheValLeuGluHisValLeuTrpGlySerGln
225 230 235 240
SerAspAspGlnThrTyrArgGluPheLysGluThrTyrLeuProTyr
245 250 255
ValLeuProAlaTyrArgAlaValTyrAlaArgPheSerGlyGluPro
260 265 270
AlaLeuIleAspArgAlaLeuAspGluAlaArgAlaValGlyThrArg
275 280 285
AspGluHisValArgAlaGlyLeuThrAlaLeuGluArgValPheLys
290 295 300
ValLeuLeuArgPheArgAlaProHisLeuLysLeuAlaGluArgAla
305 310 315 320
TyrGluValGlyGlnSerGlyProGluIleGlySerGlyGlyTyrAla
325 330 335
ProSerMetLeuGlyGluLeuLeuThrLeuThrTyrAlaAlaArgSer
340 345 350
ArgValArgAlaAlaLeuAspGluSer
355 360

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 567 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
MetThrGlnLysSerProAlaAsnGluHisAspSerAsnHisPheAsp
1 5 10 15
ValIleIleLeuGlySerGlyMetSerGlyThrGlnMetGlyAlaIle
20 25 30
LeuAlaLysGlnGlnPheArgValLeuIleIleGluGluSerSerHis
35 40 45
ProArgPheThrIleGlyGluSerSerIleProGluThrSerLeuMet
50 55 60
AsnArgIleIleAlaAspArgTyrGlyIleProGluLeuAspHisIle
65 70 75 80
ThrSerPheTyrSerThrGlnArgTyrValAlaSerSerThrGlyIle
85 90 95
LysArgAsnPheGlyPheValPheHisLysProGlyGlnGluHisAsp
100 105 110
ProLysGluPheThrGlnCysValIleProGluLeuProTrpGlyPro
115 120 125
GluSerHisTyrTyrArgGlnAspValAspAlaTyrLeuLeuGlnAla
130 135 140
AlaIleLysTyrGlyCysLysValHisGlnLysThrThrValThrGlu
145 150 155 160
TyrHisAlaAspLysAspGlyValAlaValThrThrAlaGlnGlyGlu
165 170 175
ArgPheThrGlyArgTyrMetIleAspCysGlyGlyProArgAlaPro
180 185 190
LeuAlaThrLysPheLysLeuArgGluGluProCysArgPheLysThr
195 200 205
HisSerArgSerLeuTyrThrHisMetLeuGlyValLysProPheAsp
210 215 220
AspIlePheLysValLysGlyGlnArgTrpArgTrpHisGluGlyThr
225            230            235            240
LeuHisHisMetPheGluGlyGlyTrpLeuTrpValIleProPheAsn
            245            250            255
AsnHisProArgSerThrAsnAsnLeuValSerValGlyLeuGlnLeu
         260            265            270
AspProArgValTyrProLysThrAspIleSerAlaGlnGlnGluPhe
      275            280            285
AspGluPheLeuAlaArgPheProSerIleGlyAlaGlnPheArgAsp
   290            295            300
AlaValProValArgAspTrpValLysThrAspArgLeuGlnPheSer
305            310            315            320
SerAsnAlaCysValGlyAspArgTyrCysLeuMetLeuHisAlaAsn
            325            330            335
GlyPheIleAspProLeuPheSerArgGlyLeuGluAsnThrAlaVal
         340            345            350
ThrIleHisAlaLeuAlaAlaArgLeuIleLysAlaLeuArgAspAsp
      355            360            365
AspPheSerProGluArgPheGluTyrIleGluArgLeuGlnGlnLys
   370            375            380
LeuLeuAspHisAsnAspAspPheValSerCysCysTyrThrAlaPhe
385            390            395            400
SerAspPheArgLeuTrpAspAlaPheHisArgLeuTrpAlaValGly
            405            410            415
ThrIleLeuGlyGlnPheArgLeuValGlnAlaHisAlaArgPheArg
         420            425            430
AlaSerArgAsnGluGlyAspLeuAspHisLeuAspAsnAspProPro
      435            440            445
TyrLeuGlyTyrLeuCysAlaAspMetGluGluTyrTyrGlnLeuPhe
   450            455            460
AsnAspAlaLysAlaGluValGluAlaValSerAlaGlyArgLysPro
465            470            475            480
AlaAspGluAlaAlaAlaArgIleHisAlaLeuIleAspGluArgAsp
            485            490            495
PheAlaLysProMetPheGlyPheGlyTyrCysIleThrGlyAspLys
         500            505            510
ProGlnLeuAsnAsnSerLysTyrSerLeuLeuProAlaMetArgLeu
      515            520            525
MetTyrTrpThrGlnThrArgAlaProAlaGluValLysLysTyrPhe
   530            535            540
AspTyrAsnProMetPheAlaLeuLeuLysAlaTyrIleThrThrArg
545            550            555            560
IleGlyLeuAlaLeuLysLys
            565
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 363 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
MetAsnAspIleGlnLeuAspGlnAlaSerValLysLysArgProSer
1           5              10             15
GlyAlaTyrAspAlaThrThrArgLeuAlaAlaSerTrpTyrValAla
         20             25             30
MetArgSerAsnGluLeuLysAspLysProThrGluLeuThrLeuPhe
      35             40             45
GlyArgProCysValAlaTrpArgGlyAlaThrGlyArgAlaValVal
   50             55             60
MetAspArgHisCysSerHisLeuGlyAlaAsnLeuAlaAspGlyArg
65             70             75             80
IleLysAspGlyCysIleGlnCysProPheHisHisTrpArgTyrAsp
            85             90             95
GluGlnGlyGlnCysValHisIleProGlyHisAsnGlnAlaValArg
         100            105            110
GlnLeuGluProValProArgGlyAlaArgGlnProThrLeuValThr
      115            120            125
AlaGluArgTyrGlyTyrValTrpValTrpTyrGlySerProLeuPro
   130            135            140
LeuHisProLeuProGluIleSerAlaAlaAspValAspAsnGlyAsp
145            150            155            160
PheMetHisLeuHisPheAlaPheGluThrThrThrAlaValLeuArg
            165            170            175
IleValGluAsnPheTyrAspAlaGlnHisAlaThrProValHisAla
         180            185            190
LeuProIleSerAlaPheGluLeuLysLeuPheAspAspTrpArgGln
      195            200            205
TrpProGluValGluSerLeuAlaLeuAlaGlyAlaTrpPheGlyAla
   210            215            220
GlyIleAspPheThrValAspArgTyrPheGlyProLeuGlyMetLeu
225            230            235            240
SerArgAlaLeuGlyLeuAsnMetSerGlnMetAsnLeuHisPheAsp
            245            250            255
GlyTyrProGlyGlyCysValMetThrValAlaLeuAspGlyAspVal
         260            265            270
LysTyrLysLeuLeuGlnCysValThrProValSerGluGlyLysAsn
      275            280            285
ValMetHisMetLeuIleSerIleLysLysValGlyGlyIleLeuArg
   290            295            300
ArgAlaThrAspPheValLeuPheGlyLeuGlnThrArgGlnAlaAla
305            310            315            320
GlyTyrAspValLysIleTrpAsnGlyMetLysProAspGlyGlyGly
            325            330            335
AlaTyrSerLysTyrAspLysLeuValLeuLysTyrArgAlaPheTyr
         340            345            350
ArgGlyTrpValAspArgValAlaSerGluArg
      355            360
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28958 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY:
linear                                                          (ii) MOLECULE TYPE: DNA (genomic)                                             (iii) HYPOTHETICAL: NO                                                        (iv) ANTI-SENSE: NO                                                           (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:                                       CGATCGCGTCGGCCTCGACACCGTCGAAGAGGTCACGCTCGAAGCTCCCCTCGCTCTCCC60                CTCTCAAGGCACCATTCTCATCCAGATCTCCGTCGGACCCATGGACGAGGCGGGACGAAG120               GTCGCTCTCCCTCCATGGCCGGACCGAGGACGCTCCTCAGGACGCCCCTTGGACGCGCCA180               CGCGAGCGGGTCGCTCGCTAAAGCTGCCCCCTCCCTCTCCTTCGATCTTCACGAATGGGC240               TCCTCCGGGGGGCACGCCGGTGGACACCCAAGGCTCTTACGCAGGCCTCGAAAGCGGGGG300               GCTCGCCTATGGGCCTCAGTTCCAGGGACTTCGCTCCGTCTGGAAGCGCGGCGACGAGCT360               CTTCGCCGAGGCCAAGCTCCCGGACGCAGGCGCCAAGGATGCCGCTCGGTTCGCCCTCCA420               CCCCGCCCTGTTCGACAGCGCCCTGCACGCGCTTGTCCTTGAAGACGAGCGGACGCCGGG480               CGTCGCTCTGCCCTTCTCGTGGAGAGGAGTCTCGCTGCGCTCCGTCGGCGCCACCACCCT540               GCGCGTGCGCTTCCATCGTCCGAATGGCAAGTCCTCCGTGTCGCTCCTCCTCGGCGACGC600               CGCAGGCGAGCCCCTCGCCTCGGTCCAAGCGCTCGCCACGCGCATCACGTCCCAGGAGCA660               GCTCCGCACCCAGGGAGCTTCCCTCCACGATGCTCTCTTCCGGGTTGTCTGGAGAGATCT720               GCCCAGCCCTACGTCGCTCTCTGAGGCCCCGAAGGGTGTCCTCCTAGAGACAGGGGGTCT780               CGACCTCGCGCTGCAGGCGTCTCTCGCCCGCTACGACGGTCTCGCTGCCCTCCGGAGCGC840               GCTCGACCAAGGCGCTTCGCCTCCGGGCCTCGTCGTCGTCCCCTTCATCGATTCGCCCTC900               TGGCGACCTCATAGAGAGCGCTCACAACTCCACCGCGCGCGCCCTCGCCTTGCTGCAAGC960               GTGGCTTGACGACGAACGCCTCGCCTCCTCGCGCCTCGTCCTGCTCACCCGACAGGCCAT1020              CGCAACCCACCCCGACGAGGACGTCCTCGACCTCCCTCACGCTCCTCTCTGGGGCCTTGT1080              GCGCACCGCGCAAAGCGAACACCCGGAGCTCCCTCTCTTCCTCGTCGACCTGGACCTCGG1140              TCAGGCCTCGGAGCGCGCCCTGCTCGGCGCGCTCGACACAGGAGAGCGTCAGCTCGCTCT1200              
CCGCCATGGAAAATGCCTCGTCCCGAGGTTGGTGAATGCACGCTCGACAGAGGCGCTCAT1260              CGCGCCGAACGTATCCACGTGGAGCCTTCATATCCCGACCAAAGGCACCTTCGACTCGCT1320              CGCCCTCGTCGACGCTCCTCTAGCCCGTGCGCCCCTCGCACAAGGCCAAGTCCGCGTCGC1380              CGTGCACGCGGCAGGTCTCAACTTCCGCGATGTCCTCAACACCCTTGGCATGCTTCCGGA1440              CAACGCGGGGCCGCTCGGCGGCGAAGGCGCGGGCATTGTCACCGAAGTCGGCCCAGGTGT1500              TTCCCGATACACTGTAGGCGACCGGGTGATGGGCATCTTCCGCGGAGGCTTTGGCCCCAC1560              GGTCGTCGCCGACGCCCGCATGATCTGCCCCATCCCCGATGCCTGGTCCTTCGTCCAAGC1620              CGCCAGCGTCCCCGTCGTCTTTCTCACCGCCTACTATGGACTCGTCGATGTCGGGCATCT1680              CAAGCCCAATCAACGTGTCCTCATCCATGCGGCCGCAGGCGGCGTCGGTACTGCCGCCGT1740              CCAGCTCGCGCGCCACCTCGGCGCCGAAGTCTTCGCCACCGCCAGTCCAGGGAAGTGGGA1800              CGCTCTGCGCGCGCTCGGCTTCGACGATGCGCACCTCGCGTCCTCACGTGACCTGGAATT1860              CGAGCAGCATTTCCTGCGCTCCACACGAGGGCGCGGCATGGATGTCGTCCTCAACGCCTT1920              GGCGCGCGAGTTCGTCGACGCTTCGCTGCGTCTCCTGCCGAGCGGTGGAAGCTTTGTCGA1980              GATGGGCAAGACGGATATCCGCGAGCCCGACGCCGTAGGCCTCGCCTACCCCGGCGTCGT2040              TTACCGCGCCTTCGATCTCTTGGAGGCTGGACCGGATCGAATTCAAGAGATGCTCGCAGA2100              GCTGCTCGACCTGTTCGAGCGCGGCGTGCTTCGTCCGCCGCCCATCACGTCCTGGGACAT2160              CCGGCATGCCCCCCAGGCGTTCCGCGCGCTCGCTCAGGCGCGGCATATTGGAAAGTTCGT2220              CCTCACCGTTCCCGTCCCATCGATCCCCGAAGGCACCATCCTCGTCACGGGAGGCACCGG2280              CACGCTCGGCGCGCTCATCGCGCGCCACCTCGTCGCCAATCGCGGCGACAAGCACCTGCT2340              CCTCACCTCGCGAAAGGGTGCGAGCGCTCCGGGGGCCGAGGCATTGCGGAGCGAGCTCGA2400              AGCTCTGGGGGCTGCGGTCACGCTCGCCCGGTGCGACGCGGCCGATCCACGCGCGCTCCA2460              AGCCCTCTTGGACAGCATCCCGAGCGCTCACCCGCTCACGGCCGTCGTGCACGCCGCCGG2520              CGCCCTTGACGATGGGCTGATCAGCGACATGAGCCCCGAGCGCATCGACCGCGTCTTTGC2580              TCCCAAGCTCGACGCCGCTTGGCACTTGCATCAGCTCACCCAGGACAAGGCCGCTCGGGG2640              CTTCGTCCTCTTCTCGTCCGCCTCCGGCGTCCTCGGCGGTATGGGTCAATCCAACTACGC2700              
GGGGGGCAATGCGTTCCTTGACGCGCTCGCGCATCACCGACGCGTCCATGGGCTCCCAGG2760              CTCCTCGCTCGCATGGGGCCATTGGGCCGAGCGCAGCGGAATGACCCGACAACCTCAGCG2820              GCGTCGATACCGCTCGCATGAGGCGCGCGGTCTCCGATCCATCGCCTCGGACGAGGGTCT2880              CGCCCTCTTCGATATGGCGCTCGGGCGCCCGGAGCCCGCGCTGGTCCCCGCCCGCTTCGA2940              CATGAACGCGCTCGGCGCGAAGGCCGACGGGCTACCCTCGATGTTCCAGGGTCTCGTCCG3000              CGCTCGCGTCGCGCGCAAGGTCGCCAGCAATAATGCCCTGGCCGCGTCGCTCACCCAGCG3060              CCTCGCCTCCCTCCCGCCCACCGACCGCGAGCGCATGCTGCTCGATCTCGTCCGCGCCGA3120              AGCCGCCATCGTCCTCGGCCTCGCCTCGTTCGAATCGCTCGATCCCCGTCGCCCTCTTCA3180              AGAGCTCGGTCTCGATTCCCTCATGGCCATCGAGCTCCGAAATCGACTCGCCGCCGCCAC3240              AGGCTTGCGACTCCAAGCCACCCTCCTCTTCGACCACCCGACGCCCGCCGCGCTCGCGAC3300              CCTGCTGCTCGGGAAGCTCCTCCAGCATGAAGCTGCCGATCCTCGCCCCTTGGCCGCAGA3360              GCTCGACAGGCTAGAGGCCACTCTCTCCGCGATAGCCGTGGACGCTCAAGCACGCCCGAA3420              GATCATATTACGCCTGCAATCCTGGTTGTCGAAGTGGAGCGACGCTCAGGCTGCCGACGC3480              TGGACCGATTCTCGGCAAGGATTTCAAGTCTGCTACGAAGGAAGAGCTCTTCGCTGCTTG3540              TGACGAAGCGTTCGGAGGCCTGGGTAAATGAATAACGACGAGAAGCTTGTCTCCTACCTA3600              CAGCAGGCGATGAATGAGCTTCAGCGTGCTCATCAGCCCCTCCGCGCGGTCGAAGAGAAG3660              GAGCACGAGCCCATCGCCATCGTGGCGATGAGCTGCCGCTTCCCGGGCGACGTGCGCACG3720              CCCGAGGATCTCTGGAAGCTCTTGCTCGATGGGAAAGATGCTATCTCCGACCTTCCCCCA3780              AACCGTGGTTGGAAGCTCGACGCGCTCGACGTCCACGGTCGCTCCCCAGTCCGAGAGGGA3840              GGCTTCTTCTACGACGCAGACGCCTTCGATCCGGCCTTCTTCGGGATCAGCCCACGCGAG3900              GCGCTCGCCATCGATCCCCAGCAGCGGCTCCTCCTCGAGATCTCATGGGAAGCCTTCGAG3960              CGTGCGGGCATCGACCCTGCCTCGCTCCAAGGGAGCCAAAGCGGCGTCTTCGTCGGCGTG4020              ATACACAACGACTACGACGCATTGCTGGAGAACGCAGCTGGCGAACACAAAGGATTCGTT4080              TCCACCGGCAGCACAGCGAGCGTCGCCTCCGGCCGGATCGCGTATACATTCGGCTTTCAA4140              GGGCCCGCCATCAGCGTGGACACGGCGTGCAGCTCCTCGCTCGTCGCGGTTCACCTCGCC4200              
TGCCAGGCCCTGCGCCGTGGCGAATGCTCCCTGGCGCTCGCCGGCGGCGTGACCGTCATG4260              GCCACGCCAGCAGTCTTCGTCGCGTTCGATTCCGAGAGCGCGGGCGCCCCCGATGGTCGC4320              TGCAAGTCGTTCTCGGTGGAGGCCAACGGTTCGGGCTGGGCCGAGGGCGCCGGGATGCTC4380              CTGCTCGAGCGCCTCTCCGATGCCGTCCAAAACGGTCATCCCGTCCTCGCCGTCCTTCGA4440              GGCTCCGCCGTCAACCAGGACGGCCGGAGCCAAGGCCTCACCGCGCCCAATGGCCCTGCC4500              CAAGAGCGCGTCATCCGGCAAGCGCTCGACAGCGCGCGGCTCACTCCAAAGGACGTCGAC4560              GTCGTCGAGGCTCACGGCACGGGAACCACCCTCGGAGACCCCATCGAGGCACAGGCCATT4620              CTTGCCACCTATGGCGAGGCCCATTCCCAAGACAGACCCCTCTGGCTTGGAAGTCTCAAG4680              TCCAACCTGGGACATGCTCAGGCCGCGGCCGGCGTGGGAAGCGTCATCAAGATGGTGCTC4740              GCGTTGCAGCAAGGCCTCTTGCCCAAGACCCTCCATGCCCAGAATCCCTCCCCCCACATC4800              GACTGGTCTCCGGGCACGGTAAAGCTCCTGAACGAGCCCGTCGTCTGGACGACCAACGGG4860              CATCCTCGCCACGCCGGCGTCTCCGCCTTCGGCATCTCCGGCACCAACGCCCACGTCATC4920              CTCGAAGAGGCCCCCGCCATCGCCCGGGTCGAGCCCGCAGCGTCACAGCCCGCGTCCGAG4980              CCGCTTCCCGCAGCGTGGCCCGTGCTCCTGTCGGCCAAGAGCGAGGCGGCCGTGCGCGCC5040              CAGGCAAAGCGGCTCCGCGACCACCTCCTCGCCAAAAGCGAGCTCGCCCTCGCCGATGTG5100              GCCTATTCGCTCGCGACCACGCGCGCCCACTTCGAGCAGCGCGCCGCTCTCCTCGTCAAA5160              GGCCGCGACGAGCTCCTCTCCGCCCTCGATGCGCTGGCCCAAGGACATTCCGCCGCCGTG5220              CTCGGACGAAGCGGGGCCCCAGGAAAGCTCGCCGTCCTCTTCACGGGGCAAGGAAGCCAG5280              CGGCCCACCATGGGCCGCGGCCTCTACGACGTTTTCCCCGTCTTCCGGGACGCCCTCGAC5340              ACCGTCGGCGCCCACCTCGACCGCGAGCTCGACCGCCCCCTGCGCGACGTCCTCTTCGCT5400              CCCGACGGCTCCGAGCAGGCCGCGCGCCTCGAGCAAACCGCCTTCACCCAGCCGGCCCTG5460              TTTGCCCTCGAAGTCGCCCTCTTTCAGCTTCTACAATCCTTCGGTCTGAAGCCCGCTCTC5520              CTCCTCGGACACTCCATTGGCGAGCTCGTCGCCGCCCACGTCGCCGGCGTCCTTTCTCTC5580              CAGGACGGCTGCACCCTCGTCGCCGCCCGCGCAAAGCTCATGCAAGCGCTCCCACAAGGC5640              GGCGCCATGGTCACCCTCCGAGCCTCCGAGGAGGAAGTCCGCGACCTTCTCCAGCCCTAC5700              
GAAGGCCGAGCTAGCCTCGCCGCCCTCAATGGGCCTCTCTCCACCGTCGTCGCTGGCGAT5760              GAAGACGCGGTGGTGGAGATCGCCCGCCAGGCCGAAGCCCTCGGACGAAAGACCACACGC5820              CTGCGCGTCAGCCACGCCTTCCATTCCCCGCACATGGACGGAATGCTCGACGACTTCCGC5880              CGCGTCGCCCAGAGCCTCACCTACCATCCCGCACGCATCCCCATCATCTCCAACGTCACC5940              GGCGCGCGCGCCACGGACCACGAGCTCGCCTCGCCCGACTACTGGGTCCGCCACGTTCGC6000              CACACCGTCCGCTTCCTCGACGGCGTACGTGCCCTTCACGCCGAAGGGGCACGTGTCTTT6060              CTCGAGCTCGGGCCTCACGCTGTCCTCTCCGCCCTTGCGCAAGACGCCCTCGGACAGGAC6120              GAAGGCACGTCGCCATGCGCCTTCCTTCCCACCCTCCGCAAGGGACGCGACGACGCCGAG6180              GCGTTCACCGCCGCGCTCGGCGCTCTCCACTCCGCAGGCATCACACCCGACTGGAGCGCT6240              TTCTTCGCCCCCTTCGCTCCACGCAAGGTCTCCCTCCCCACCTATGCCTTCCAGCGCGAG6300              CGCTTCTGGCCCGACGCCTCCAAGGCACCCGGCGCCGACGTCAGCCACCTTGCTCCGCTC6360              GAGGGGGGGCTCTGGCAAGCCATCGAGCGCGGGGACCTCGATGCGCTCAGCGGTCAGCTC6420              CACGTGGACGGCGACGAGCGGCGCGCCGCGCTCGCCCTGCTCCTTCCCACCCTCTCGAGC6480              TTTCGCCACGAGCGGCAAGAGCAGAGCACGGTCGACGCCTGGCGCTACCGTATCACCTGG6540              AAGCCTCTGACCACCGCCGAAACACCCGCCGACCTCGCCGGCACCTGGCTCGTCGTCGTG6600              CCGGCCGCTCTGGACGACGACGCGCTCCCCTCCGCGCTCACCGAGGCGCTCACCCGGCGC6660              GGCGCGCGCGTCCTCGCCTTGCGCCTGAGCCAGGCCCACCTGGACCGCGAGGCTCTCGCC6720              GAGCATCTGCGCCAGGCTTGCGCCGAGACCGCCCCGATTCGCGGCGTGCTCTCGCTCCTC6780              GCCCTCGACGAGCGCCCCCTCGCAGACCGTCCTGCCCTGCCCGCCGGACTCGCCCTCTCG6840              CTTTCTCTCGCTCAAGCCCTCGGCGACCTCGACCTCGAGGCGCCCTTGTGGTTCTTCACG6900              CGCGGCGCCGTCTCCATTGGACACTCTGACCCCCTCGCCCATCCCGCCCAGGCCATGACC6960              TGGGGCTTGGGCCGCGTCATCGGCCTCGAGCACCCCGACCGGTGGGGAGGTCTCGTCGAC7020              GTCTGCGCTGGGGTCGACGAGAGCGCCGTGGGCCGCTTGCTGCCGGCCCTCGCCGAGCGC7080              CACGACGAAGACCAGCTCGCTCTCCGCCCGGCCGGACTCTACGCTCGCCGCATCGTCCGC7140              GCCCCGCTCGGCGATGCGCCTCCCGCGCGCGACTTCACGCCCGGAGGCACCATTCTCATC7200              
ACCGGCGGCACCGGCGCCATTGGCGCTCACGTCGCCCGATGGCTCGCTCGAAGAGGCGCT7260              CAGCACCTCGTCCTCATCAGCCGCCGAGGCGCCGAGGCCCCTGGCGCCTCGGAGCTCCAC7320              GACGAGCTCTCGGCCCTCGGCGCGCGCACCACCCTCGCCGCGTGCGATGTCGCCGACCGG7380              AATGCTGTCGCCACGCTTCTTGAGCAGCTCGACGCCGAAGGGTCGCAGGTCCGCGCCGTG7440              TTCCACGCGAGCGGCATCGAACACCACGCTCCGCTCGACGCCACCTCTTTCAGGGATCTC7500              GCCGAGGTTGTCTCCGGCAAGGTCGAAGGTGCAAAGCACCTCCACGACCTGCTCGGCTCT7560              CGACCCCTCGACGCCTTTGTTCTCTTTTCGTCCGGCGCGGCCGTCTGGGGCGGCGGACAG7620              CAAGGCGGCTACGCGGCCGCAAACGCCTTCCTCGACGCCCTTGCCGAGCATCGGCGCAGC7680              GCTGGATTGACAGCGACGTCGGTGGCCTGGGGCGCGTGGGGCGGCGGCGGCATGGCCACC7740              GATCAGGCGGCAGCCCACCTCCAACAGCGCGGTCTGTCGCGGATGGCCCCCTCGCTTGCC7800              CTGGCGGCGCTCGCGCTGGCTCTGGAGCACGACGAGACCACCGTCACCGTCGCCGACATC7860              GACTGGGCGCGCTTTGCGCCTTCGTTCAGCGCCGCTCGCCCCCGCCCGCTCCTGCGCGAT7920              TTGCCCGAGGCGCAGCGCGCTCTCGAGACCAGCGAAGGCGCGTCCTCCGAGCATGGCCCG7980              GCCCCCGACCTCCTCGACAAGCTCCGGAGCCGCTCGGAGAGCGAGCAGCTTCGTCTGCTC8040              GTCTCGCTGGTGCGCCACGAGACGGCCCTCGTCCTCGGCCACGAAGGCGCCTCCCATGTC8100              GACCCCGACAAGGGCTTCCTCGATCTCGGTCTCGATTCGCTCATGGCCGTCGAGCTTCGC8160              CGGCGCTTGCAACAGGCCACCGGCATCAAGCTCCCGGCCACCCTCGCCTTCGACCATCCC8220              TCTCCTCATCGAGTCGCGCTCTTCTTGCGCGACTCGCTCGCCCACGCCCTCGGCACGAGG8280              CTCTCCGTCGAGCCCGACGCCGCCGCGCTCCCGGCGCTTCGCGCCGCGAGCGACGAGCCC8340              ATCGCCATCGTCGGCATGGCCCTCCGCCTGCCGGGCGGCGTCGGCGATGTCGACGCTCTT8400              TGGGAGTTCCTGGCCCAGGGACGCGACGGCGTCGAGCCCATTCCAAAGGCCCGATGGGAT8460              GCCGCTGCGCTCTACGACCCCGACCCCGACGCCAAGACCAAGAGCTACGTCCGGCATGCC8520              GCCATGCTCGACCAGGTCGACCTCTTCGACCCTGCCTTCTTTGGCATCAGCCCCCGGGAG8580              GCCAAACACCTCGACCCCCAGCACCGCCTGCTCCTCGAATCTGCCTGGCAGGCCCTCGAA8640              GACGCCGGCATCGTCCCCCCCACCCTCAAGGATTCCCCCACCGGCGTCTTCGTCGGCATC8700              
GGCGCCAGCGAATACGCATTGCGAGAGGCGAGCACCGAAGATTCCGACGCTTATGCCCTC8760              CAAGGCACCGCCGGGTCCTTTGCCGCGGGGCGCTTGGCCTACACGCTCGGCCTGCAAGGG8820              CCCGCGCTCTCGGTCGACACCGCCTGCTCCTCCTCGCTCGTCGCCCTCCACCTCGCCTGC8880              CAAGCCCTCCGACAGGGCGAGTGCAACCTCGCCCTCGCCGCGGGCGTCTCCGTCATGGCC8940              TCCCCCGAGGGCTTCGTCCTCCTTTCCCGCCTGCGCGCCTTGGCGCCCGACGGCCGCTCC9000              AAGACCTTCTCGGCCAACGCCGACGGCTACGGACGCGGAGAAGGCGTCATCGTCCTTGCC9060              CTCGAGCGGCTCGGTGACGCCCTCGCCCGAGGACACCGCGTCCTCGCCCTCGTCCGCGGC9120              ACCGCCATCAACCACGACGGCGCGTCGAGCGGTATCACCGCCCCCAACGGCACCTCCCAG9180              CAGAAGGTCCTCCGCGCCGCGCTCCACGACGCCCGCATCACCCCCGCCGACGTCGACGTC9240              GTCGAGTGCCATGGCACCGGCACCTCCTTGGGAGACCCCATCGAGGTGCAAGCCCTGGCC9300              GCCGTCTACGCCGACGGCAGACCCGCTGAAAAGCCTCTCCTTCTCGGCGCGCTCAAGACC9360              AACATCGGCCATCTCGAGGCCGCCTCCGGCCTCGCGGGCGTCGCCAAGATCGTCGCCTCC9420              CTCCGCCATGACGCCCTGCCCCCCACCCTCCACACGGGCCCGCGCAATCCCTTGATTGAT9480              TGGGATACACTCGCCATCGACGTCGTTGATACCCCGAGGTCTTGGGCCCGCCACGAAGAT9540              AGCAGTCCCCGCCGCGCCGGCGTCTCCGCCTTCGGACTCTCCGGCACCAACGCCCACGTC9600              ATCCTCGAGGAGGCTCCCGCCGCCCTGTCGGGCGAGCCCGCCACCTCACAGACGGCGTCG9660              CGACCGCTCCCCGCGGCGTGTGCCGTGCTCCTGTCGGCCAGGAGCGAGGCCGCCGTCCGC9720              GCCCAGGCGAAGCGGCTCCGCGACCACCTCCTCGCCCACGACGACCTCGCCCTTATCGAT9780              GTGGCCTATTCGCAGGCCACCACCCGCGCCCACTTCGAGCACCGCGCCGCTCTCCTGGCC9840              CGCGACCGCGACGAGCTCCTCTCCGCGCTCGACTCGCTCGCCCAGGACAAGCCCGCCCCG9900              AGCACCGTTCTCGGCCGGAGCGGAAGCCACGGCAAGGTCGTCTTCGTCTTTCCTGGGCAA9960              GGCTCGCAGTGGGAAGGGATGGCCCTCTCCCTGCTCGACTCCTCGCCGGTCTTCCGCGCT10020             CAGCTCGAAGCATGCGAGCGCGCGCTCGCTCCTCACGTCGAGTGGAGCCTGCTCGCCGTC10080             CTGCGCCGCGACGAGGGCGCCCCCTCCCTCGACCGCGTCGACGTCGTACAGCCCGCCCTC10140             TTTGCCGTCATGGTCTCCCTGGCCGCCCTCTGGCGCTCGCTCGGCGTCGAGCCCGCCGCC10200             
GTCGTCGGCCACAGCCAGGGCGAGATCGCCGCCGCCTTCGTCGCAGGCGCTCTCTCCCTC10260             GAGGACGCGGCGCGCATCGCCGCCCTGCGCAGGAAAGCGCTCACCACCGTCGGCGGCAAC10320             GGCGGCATGGCCGCCGTCGAGCTCGGCGCCTCCGACCTCCAGACCTACCTCGCTCCCTGG10380             GGCGACAGGCTCTCCACCGCCGCCGTCAACAGCCCCAGGGCTACCCTCGTATCCGGCGAG10440             CCCGCCGCCGTCGACGCGCTGCTCGACGTCCTCACCGCCACCAAGGTGTTCGCCCGCAAG10500             ATCCGCGTCGACTACGCCTCCCACTCCGCCCAGATGGACGCCGTCCAAGACGAGCTCGCC10560             GCAGGTCTAGCCAACATCGCTCCTCGGACGTGCGAGCTCCCTCTTTATTCGACCGTCACC10620             GGCACCAGGCTCGACGGCTCCGAGCTCGACGGCGCGTACTGGTATCGAAACCTCCGGCAA10680             ACCGTCCTGTTCTCGAGCGCGACCGAGCGGCTCCTCGACGATGGGCATCGCTTCTCCGTC10740             GAGGTCAGCCCCCATCCCGTGCTCACGCTCGCCCTCCGCGAGACCTGCGAGCGCTCACCG10800             CTCGATCCCGTCGTCGTCGGCTCCATTCGACGAGAAGAAGGCCACCTCGCCCGCCTGCTC10860             CTCTCCTGGGCGGAGCTCTCTACCCGAGGCCTCGCGCTCGACTGGAAGGACTTCTTCGCG10920             CCCTACGCTCCCCGCAAGGTCTCCCTCCCCACCTACCCCTTCCAGCGAGAGCGGTTCTGG10980             CTCGACGTCTCCACGGACGAACGCTTCCGACGTCGCCTCCGCAGGCCTGACCTCGGCCGA11040             CCAATCCCGCTGCTCGGCGCCGCCGTCGCCTTCGCCGACCGCGGTGGCTTTCTCTTTACA11100             GGGCGGCTCTCCCTCGCAGAGCACCCGTGGCTCGAAGGCCATGCCGTCTTCGGCACACCC11160             ATCCTACCGGGCACCGGCTTTCTCGAGCTCGCCCTGCACGTCGCCCACCGCGTCGGCCTC11220             GACACCGTCGAAGAGCTCACGCTCGAGGCCCCTCTCGCTCTCCCATCGCAGGACACCGTC11280             CTCCTCCAGATCTCCGTCGGGCCCGTGGACGACGCAGGACGAAGGGCGCTCTCTTTCCAT11340             AGCCGACAAGAGGACGCGCTTCAGGATGGCCCCTGGACTCGCCACGCCAGCGGCTCTCTC11400             TCGCCGGCGACCCCATCCCTCTCCGCCGATCTCCACGAGTGGCCTCCCTCGAGTGCCATC11460             CCGGTGGACCTCGAAGGCCTCTACGCAACCCTCGCCAACCTCGGGCTTGCCTACGGCCCC11520             GAGTTCCAGGGCCTCCGCTCCGTCTACAAGCGCGGCGACGAGCTCTTTGCCGAAGCCAAG11580             CTCCCGGAAGCGGCCGAAAAGGATGCCGCCCGGTTTGCCCTCCACCCTGCGCTGCTCGAC11640             AGCGCCCTGCATGCACTGGCCTTTGAGGACGAGCAGAGAGGGACGGTCGCTCTGCCCTTC11700             
TCGTGGAGCGGAGTCTCGCTGCGCTCCGTCGGTGCCACCACCTTGCGCGTGCGCTTCCAC11760             CGTCCCAAGGGTGAATCCTCCGTCTCGATCGTCCTGGCCGACGCCGCAGGTGACCCTCTT11820             GCCTCGGTGCAAGCGCTCGCCATGCGGACGACGTCCGCCGCGCAGCTCCGCACCCCGGCA11880             GCTTCCCACCATGATGCGCTCTTCCGCGTCGACTGGAGCGAGCTCCAAAGCCCCACTTCA11940             CCGCCTGCCGCCCCGAGCGGCGTCCTTCTCGGCACAGGCGGCCACGATCTCGCGCTCGAC12000             GCCCCGCTCGCCCGCTACGCCGACCTCGCTGCCCTCCGAAGCGCCCTCGACCAGGGCGCT12060             TCGCCTCCCGGCCTCGTCGTCGCCCCCTTCATCGATCGACCGGCAGGCGACCTCGTCCCG12120             AGCGCCCACGAGGCCACCGCGCTCGCACTCGCCCTCTTGCAAGCCTGGCTCGCCGACGAA12180             CGCCTCGCCTCGTCGCGCCTCGTCCTCGTCACCCGACGCGCCGTCGCCACCCACACCGAA12240             GACGACGTCAAGGACCTCGCTCACGCGCCGCTCTGGGGGCTCGCGCGCTCCGCGCAAAGT12300             GAGCACCCAGACCTCCCGCTCTTCCTCGTCGACATCGACCTCAGCGAGGCCTCCCAGCAG12360             GCCCTGCTAGGCGCGCTCGACACAGGAGAACGCCAGCTCGCCCTCCGCAACGGGAAACCC12420             CTCATCCCGAGGTTGGCGCAACCACGCTCGACGGACGCGCTCATCCCGCCGCAAGCACCC12480             ACGTGGCGCCTCCATATTCCGACCAAAGGCACCTTCGACGCGCTCGCCCTCGTCGACGCC12540             CCCGAGGCCCAGGCGCCCCTCGCACACGGCCAAGTCCGCATCGCCGTGCACGCGGCAGGG12600             CTCAACTTCCGCGATGTCGTCGACACCCTTGGCATGTATCCGGGCGACGCGCCGCCGCTC12660             GGAGGCGAAGGCGCGGGCATCGTTACTGAAGTCGGTCCAGGTGTCTCCCGATACACCGTA12720             GGCGACCGGGTGATGGGGGTCTTCGGCGCAGCCTTTGGTCCCACGGCCATCGCCGACGCC12780             CGCATGATCTGCCCCATCCCCCACGCCTGGTCCTTCGCCCAAGCCGCCAGCGTCCCCATC12840             ATCTATCTCACCGCCTACTATGGACTCGTCGATCTCGGGCATCTGAAACCCAATCAACGT12900             GTCCTCATCCATGCGGCCGCCGGCGGCGTCGGGACGGCCGCCGTTCAGCTCGCACGCCAC12960             CTCGGCGCCGAGGTCTTTGCCACCGCCAGTCCAGGGAAGTGGAGCGCTCTCCGCGCGCTC13020             GGCTTCGACGATGCGCACCTCGCGTCCTCACGTGACCTGGGCTTCGAGCAGCACTTCCTG13080             CGCTCCACGCATGGGCGCGGCATGGATGTCGTCCTCGACTGTCTGGCACGCGAGTTCGTC13140             GACGCCTCGCTGCGCCTCATGCCGAGCGGTGGACGCTTCATCGAGATGGGAAAGACGGAC13200             
ATCCGTGAGCCCGACGCGATCGGCCTCGCCTACCCTGGCGTCGTTTACCGCGCCTTCGAC13260             GTCACAGAGGCCGGACCGGATCGAATTGGGCAGATGCTCGCAGAGCTGCTCAGCCTCTTC13320             GAGCGCGGTGTGCTTCGTCTGCCACCCATCACATCCTGGGACATCCGTCATGCCCCCCAG13380             GCCTTCCGCGCGCTCGCCCAGGCGCGGCATGTTGGGAAGTTCGTCCTCACCATTCCCCGT13440             CCGATCGATCCCGAGGGGACCGTCCTCATCACGGGAGGCACCGGGACGCTAGGAGTCCTG13500             GTCGCACGCCACCTCGTCGCGAAACACAGCGCCAAACACCTGCTCCTCACCTCGAGGAAG13560             GGCGCGCGTGCTCCGGGCGCGGAGGCTCTGCGAAGCGAGCTCGAAGCGCTGGGGGCCTCG13620             GTCACCCTCGTCGCGTGCGACGTGGCCGACCCACGCGCCCTCCGGACCCTCCTGGACAGC13680             ATCCCGAGGGATCATCCGATCACGGCCGTCGTGCACGCCGCCGGCGCCCTCGACGACGGG13740             CCGCTCGGTAGCATGAGCGCCGAGCGCATCGCTCGCGTCTTTGACCCCAAGCTCGATGCC13800             GCTTGGTACTTGCATGAGCTCACCCAGGACGAGCCGGTCGCGGCCTTCGTCCTCTTCTCG13860             GCCGCCTCCGGCGTCCTTGGTGGTCCAGGTCAGTCGAACTACGCCGCTGCCAATGCCTTC13920             CTCGATGCGCTCGCACATCACCGGCGCGCCCAAGGACTCCCAGCCGCTTCGCTCGCCTGG13980             GGCTACTGGGCCGAGCGCAGTGGGATGACCCGGCACCTCAGCGCCGCCGACGCCGCTCGC14040             ATGAGGCGCGCCGGCGTCCGGCCCCTCGACACTGACGAGGCGCTCTCCCTCTTCGATGTG14100             GCTCTCTTGCGACCCGAGCCCGCTCTGGTCCCCGCCCCCTTCGACTACAACGTGCTCAGC14160             ACGAGTGCCGACGGCGTGCCCCCGCTGTTCCAGCGTCTCGTCCGCGCTCGCATCGCGCGC14220             AAGGCCGCCAGCAATACTGCCCTCGCCTCGTCGCTTGCAGAGCACCTCTCCTCCCTCCCG14280             CCCGCCGAACGCGAGCGCGTCCTCCTCGATCTCGTCCGCACCGAAGCCGCCTCCGTCCTC14340             GGCCTCGCCTCGTTCGAATCGCTCGATCCCCATCGCCCTCTACAAGAGCTCGGCCTCGAT14400             TCCCTCATGGCCCTCGAGCTCCGAAATCGACTCGCCGCCGCCGCCGGGCTGCGGCTCCAG14460             GCTACTCTCCTCTTCGACTATCCAACCCCGACTGCGCTCTCACGCTTTTTCACGACGCAT14520             CTCTTCGGGGGAACCACCCACCGCCCCGGCGTACCGCTCACCCCGGGGGGGAGCGAAGAC14580             CCTATCGCCATCGTGGCGATGAGCTGCCGCTTCCCGGGCGACGTGCGCACGCCCGAGGAT14640             CTCTGGAAGCTCTTGCTCGACGGACAAGATGCCATCTCCGGCTTTCCCCAAAATCGCGGC14700             
TGGAGTCTCGATGCGCTCGACGCCCCCGGTCGCTTCCCAGTCCGGGAGGGGGGCTTCGTC14760             TACGACGCAGACGCCTTCGATCCGGCCTTCTTCGGGATCAGTCCACGTGAAGCGCTCGCC14820             GTTGATCCCCAACAGCGCATTTTGCTCGAGATCACATGGGAAGCCTTCGAGCGTGCAGGC14880             ATCGACCCGGCCTCCCTCCAAGGAAGCCAAAGCGGGGTCTTCGTTGGCGTATGGCAGAGC14940             GACTACCAATGCATCGCTGGTGAACGCGACTGGCGAATACAAGGACTCGTTGCCACCGGT15000             AGCGCAGCGCGTCCGTCCGGCCGAATCGCATACACGTTCGGACTTCAAGGGCCCGCCATC15060             AGCGTGGAGACGGCGTGCAGCTTCCTCGTCGCGGTTCACCTCGCCTGCCAGGCCCCCCCC15120             CACGGCGAATACTCCCTGGCGCTCGCTGGCGGCGTGACCATCATGGCCACGCCAGCCATA15180             TTCATCGCGTTCGACTCCGAGAGCGCGGGTGCCCCCGACGGTCGCTGCAAGGCCTTCTCG15240             CCGGAAGCCGACGGTTCGGGCTGGGCCGAAGGCGCCGGGATGCTCCTGCTCGAGCGCCTC15300             TCCGATGCCGTCCAAAACGGTCATCCCGTCCTCGCCGTCCTTCGAGGCTCCGCCGTCAAC15360             CAGGACGGCCGGAGCCAAGGCCTCACCGCGCCCAATGGCCCTGCCCAGGAGCGCGTCATC15420             CGGCAAGCGCTCGACAGCGCGCGGCTCACTCCAAAGGACGTCGACGTCGTCGAGGCTCAC15480             GGCACGGGAACCACCCTCGGAGACCCCATCGAGGCACAGGCCGTTTTTGCCACCTATGGC15540             GAGGCCCATTCCCAAGACAGACCCCTCTGGCTTGGAAGCCTCAAGTCCAACCTGGGACAT15600             ACTCAGGCCGCGGCCGGCGTCGGCGGCATCATCAAGATGGTGCTCGCGTTGCAGCACGGT15660             CTCTTGCCCAAGACCCTCCATGCCCAGAATCCCTCCCCCCACATCGACTGGTCTCCAGGC15720             ATCGTAAAGCTCCTGAACGAGGCCGTCGCCTGGACGACCAGCGGACATCCTCGCCGCGCC15780             GGTGTTTCCTCGTTCGGCGTCTCCGGCACCAACGCCCATGTCATCCTCGAAGAGGCTCCC15840             GCCGCCACGCGGGCCGAGTCAGGCGCTTCACAGCCTGCATCGCAGCCGCTCCCCGCGGCG15900             TGGCCCGTCGTCCTGTCGGCCAGGAGCGAGGCCGCCGTCCGCGCCCAGGCTCAAAGGCTC15960             CGCGAGCACCTGCTCGCCCAAGGCGACCTCACCCTCGCCGATGTGGCCTATTCGCTGGCC16020             ACCACCCGCGCCCACTTCGAGCACCGCGCCGCTCTCGTAGCCCACGACCGCGACGAGCTC16080             CTCTCCGCGCTCGACTCGCTCGCCCAGGACAAGCCCGCACCGAGCACCGTCCTCGGACGG16140             AGCGGAAGCCACGGCAAGGTCGTCTTCGTCTTTCCTGGGCAAGGCTCGCAGTGGGAAGGG16200             
ATGGCCCTCTCCCTGCTCGACTCCTCGCCCGTCTTCCGCACACAGCTCGAAGCATGCGAG16260             CGCGCGCTCCGTCCTCACGTCGAGTGGAGCCTGCTCGCCGTCCTGCGCCGCGACGAGGGC16320             GCCCCCTCCCTCGACCGCGTCGACGTCGTGCAGCCCGCCCTCTTTGCCGTCATGGTCTCC16380             CTGGCCGCCCTCTGGCGCTCGCTCGGCGTCGAGCCCGCCGCCGTCGTCGGCCACAGCCAG16440             GGCGAGATAGCCGCCGCCTTCGTCGCAGGCGCTCTCTCCCTCGAGGACGCGGCCCGCATC16500             GCCGCCCTGCGCAGCAAAGCGTCACCACCGTCGCCGGCAACGGGCATGGCCGCCGTCGAG16560             CTCGGCGCCTCCGACCTCCAGACCTACCTCGCTCCCTGGGGCGACAGGCTCTCCATCGCC16620             GCCGTCAACAGCCCCAGGGCCACGCTCGTATCCGGCGAGCCCGCCGCCGTCGACGCGCTG16680             ATCGACTCGCTCACCGCAGCGCAGGTCTTCGCCCGAAGAGTCCGCGTCGACTACGCCTCC16740             CACTCAGCCCAGATGGACGCCGTCCAAGACGAGCTCGCCGCAGGTCTAGCCAACATCGCT16800             CCTCGGACGTGCGAGCTCCCTCTTTATTCGACCGTCACCGGCACCAGGCTCGACGGCTCC16860             GAGCTCGACGGCGCGTACTGGTATCGAAACCTCCGGCAAACCGTCCTGTTCTCGAGCGCG16920             ACCGAGCGGCTCCTCGACGATGGGCATCGCTTCTTCGTCGAGGTCAGCCCTCATCCCGTG16980             CTCACGCTCGCCCTCCGCGAGACCTGCGAGCGCTCACCGCTCGATCCCGTCGTCGTCGGC17040             TCCATTCGACGCGACGAAGGCCACCTCCCCCGTCTCCTTGCTCTCTTGGGCCGAGCTCTA17100             TGGCCGGGCCTCACGCCCGAGTGGAAGGCCTTCTTCGCGCCCTTCGCTCCCCGCAAGGTC17160             TCACTCCCCACCTACGCCTTCCAGCGCGAGCGTTTCTGGCTCGACGCCCCCAACGCACAC17220             CCCGAAGGCGTCGCTCCCGCTGCGCCGATCGATGGGCGGTTTTGGCAAGCCATCGAACGC17280             GGGGACCTCGACGCGCTCAGCGGCCAGCTCCACGCGGACGGCGACGAGCAGCGCGCCGCC17340             CTCGCCCTGCTCCTTCCCACCCTCTCGAGCTTTCACCACCAGCGCCAAGAGCAGAGCACG17400             GTCGACACCTGGCGCTACCGCATCACGTGGAGGCCTCTGACCACCGCCGCCACGCCCGCC17460             GACCTCGCCGGCACCTGGCTCCTCGTCGTGCCGTCCGCGCTCGGCGACGACGCGCTCCCT17520             GCCACGCTCACCGATGCGCTTACCCGGCGCGGCGCGCGTGTCCTCGCGCTGCGCCTGAGC17580             CAGGTTCACATAGGCCGCGCGGCTCTCACCGAGCACCTGCGCGAGGCTGTTGCCGAGACT17640             GCCCCGATTCGCGGCGTGCTCTCCCTCCTCGCCCTCGACGAGCGCCCCCTCGCGGACCAT17700             
GCCGCCCTGCCCGCGGGCCTTGCCCTCTCGCTCGCCCTCGTCCAAGCCCTCGGCGACCTC17760             GCCCTCGAGGCTCCCTTGTGGCTCTTCACGCGCGGCGCCGTCTCGATTGGACACTCCGAC17820             CCACTCGCCCATCCCACCCAGGCCATGATCTGGGGCTTGGGCCGCGTCGTCGGCCTCGAG17880             CACCCCGAGCGGTGGGGCGGGCTCGTCGACCTCGGCGCAGCGCTCGACGCGAGCGCCGCA17940             GGCCGCTTGCTCCCGGCCCTCGCCCAGCGCCACGACGAAGACCAGCTCGCGCTGCGCCCG18000             GCCGGCCTCTACGCACGCCGCTTCGTCCGCGCCCCGCTCGGCGATGCGCCTGCCGCTCGC18060             GGCTTCATGCCCCGAGGCACCATCCTCATCACCGGTGGTACCGGCGCCATTGGCGCTCAC18120             GTCGCCCGATGGCTCGCTCGAAAAGGCGCTGAGCACCTCGTCCTCATCAGCCGACGAGGG18180             GCCCAGGCCGAAGGCGCCGTGGAGCTCCACGCCGAGCTCACCGCCCTCGGCGCGCGCGTC18240             ACCTTCGCCGCGTGCGATGTCGCCGACAGGAGCGCTGTCGCCACGCTTCTCGAGCAGCTC18300             GACGCCGGAGGGCCACAGGTGAGCGCCGTGTTCCACGCGGGCGGCATCGAGCCCCACGCT18360             CCGCTCGCCGCCACCTCCATGGAGGATCTCGCCGAGGTTGTCTCCGGCAAGGTACAAGGT18420             GCAAGACACCTCCACGACCTGCTCGGCTCTCGACCCCTCGACGCCTTTGTTCTCTTCTCG18480             TCCGGCGCGGTCGTCTGGGGCGGCGGACAACAAGGCGGCTATGCCGCTGCGAACGCCTTC18540             CTCGATGCCCTGGCCGAGCAGCGGCGCAGCCTTGGGCTGACGGCGACATCGGTGGCCTGG18600             GGCGTGTGGGGCGGCGGCGGCATGGCTACCGGGCTCCTGGCAGCCCAGCTAGAGCAACGC18660             GGTCTGTCGCCGATGGCCCCCTCGCTGGCCGTGGCGACGCTCGCGCTGGCGCTGGAGCAC18720             GACGAGACCACCCTCACCGTCGCCGACATCGACTGGGCGCGCTTTGCGCCTTCGTTCAGC18780             GCCGCTCGCTCCCGCCCGCTCCTGCGCGATTTGCCCGAGGCGCAGCGCGCTCTCGAAGCC18840             AGCGCCGATGCGTCCTCCGAGCAAGACGGGGCCACAGGCCTCCTCGACAAGCTCCGAAAC18900             CGCTCGGAGAGCGAGCAGATCCACCTGCTCTCCTCGCTGGTGCGCCACGAAGCGGCCCTC18960             GTCCTGGGCCATACCGACGCCTCCCAGGTCGACCCCCACAAGGGCTTCATGGACCTCGGC19020             CTCGATTCGCTCATGACCGTCGAGCTTCGTCGGCGCTTGCAGCAGGCCACCGGCATCAAG19080             CTCCCGGCCACCCTCGCCTTCGACCATCCCTCTCCTCATCGCGTCGCGCTCTTCTTGCGC19140             GACTCGCTCGCCCACGCCCTCGGCGCGAGGCTCTCCGTCGAGCGCGACGCCGCCGCGCTC19200             
CCGGCGCTTCGCTCGGCGAGCGACGAGCCCATCGCCATCGTCGGCATGGCCCTCCGCTTG19260             CCGGGCGGCATCGGCGATGTCGACGCTCTTTGGGAGTTCCTCGCCCAAGGACGCGACGCC19320             GTCGAGCCCATTCCCCATGCCCGATGGGATGCCGGTGCCCTCTACGACCCCGACCCCGAC19380             GCCAAGGCCAAGAGCTACGTCCGGCATGCCGCCATGCTCGACCAGGTCGACCTCTTCGAT19440             CCTGCCTTCTTTGGCATCAGCCCTCGCGAGGCCAAATACCTCGACCCCCAGCACCGCCTG19500             CTCCTCGAATCTGCCTGGCTGGCCCTCGAGGACGCCGGCATCGTCCCCTCCACCCTCAAG19560             GATTCTCCCACCGGCGTCTTCGTCGGCATCGGCGCCAGCGAATACGCACTGCGAAACACG19620             AGCTCCGAAGAGGTCGAAGCGTATGCCCTCCAAGGCACCGCCGGGTCCTTTGCCGCGGGG19680             CGCTTGGCCTACACGCTCGGCCTGCAAGGGCCCGCGCTCTCGGTCGACACCGCCTGCTCC19740             TCCTCGCTCGTCGCCCTCCACCTCGCCTGCCAAGCCCTCCGACAGGGCGAGTGCAACCTC19800             GCCCTCGCCGCGGGCGTCTCCGTCATGGCCTCCCCCGGGCTCTTCGTCGTCCTTTCCCGC19860             ATGCGTGCTTTGGCGCCCGATGGCCGCTCCAAGACCTTCTCGACCAACGCCGACGGCTAC19920             GGACGCGGAGAGGGCGTCGTCGTCCTTGCCCTCGAGCGGCTCGGCGACGCCCTCGCCCGA19980             GGACACCGCGTCCTCGCCCTCGTCCGCGGCACCGCCATGAACCATGACGGCGCGTCGAGC20040             GGCATCACCGCCCCCAATGGCACCTCCCACCAGAAGGTCCTCCGCGCCGCGCTCCACGAC20100             GCCCATATCGGCCCTGCCGACGTCGACGTCGTCGAATGCCATGGCACCGGCACCTCCTTG20160             GGAGACCCCATCGAGGTGCAAGCCCTGGCCGCCGTCTACGCCGATGGCAGACCCGCTGAA20220             AAGCCTCTCCTTCTCGGCGCACTCAAGACCAACATTGGCCATCTCGAGGCCGCCTCCGGC20280             CTCGCGGGCGTCGCCAAGATCGTCGCCTCCCTCCGCCATGACGCCCTGCCCCCCACCCTC20340             CACACGACCCCGCGCAATCCCCTGATCGAGTGGGATGCGCTCGCCATCGACGTCGTCGAT20400             GCCACGAGGGCGTGGGCCCGCCACGAAGATGGCAGTCCCCGCCGCGCCGGCGTCTCCGCC20460             TTCGGACTCTCCGGCACCAACGCCCACGTTATCCTCGAAGAGGCTCCCGCGATCCCGCAG20520             GCCGAGCCCACCGCGGCACAGCTCGCGTCGCAGCCGCTTCCCGCAGCCTGGCCCGTGCTC20580             CTGTCGGCCAGGAGCGAGCCGGCCGTGCGCGCCCAGGCCCAGAGGCTCCGCGACCACCTC20640             CTCGCCCACGACGACCTCGCCCTGGCCGATGTAGCCTACTCGCTCGCCACCACCCGGGCT20700             
ACCTTCGAGCACCGTGCCGCTCTCGTGGTCCACGACCGCGAAGAGCTCCTCTCCGCGCTC20760             GATTCGCTCGCCCAGGGAAGGCCCGCCCCGAGCACCGTCGTCGAACGAAGCGGAAGCCAC20820             GGCAAGGTCGTCTTCGTCTTTCCTGGGCAAGGCTCGCAGTGGGAAGGGATGGCCCTCTCC20880             CTGCTCGATACCTCGCCGGTCTTCCGGGCACAGCTCGAAGCGTGCGAGCGCGCCCTCGCG20940             CCCCACGTGGACTGGTCGCTGCTCGCGGTGCTCCGCGGCGAGGAGGGCGCGCCCCCGCTC21000             GACCGGGTCGACGTGGTCCAGCCCGCGCTGTTCTCGATGATGGTCTCGCTGGCCGCCCTG21060             TGGCGCTCCATGGGCGTCGAGCCCGACGCGGTGGTCGGCCATAGCCAGGGCGAGATCGCC21120             GCGGCCTGTGTGGCGGGCGCGCTGTCGCTCGAGGACGCTGCCAAGCTGGTGGCGCTGCGC21180             AGCCGTGCGCTCGTGGAGCTCGCCGGCCAGGGGGCCATGGCCGCGGTGGAGCTGCCGGAG21240             GCCGAGGTCGCACGGCGCCTCCAGCGCTATGGCGATCGGCTCTCCATCGGGGCGATCAAC21300             AGCCCTCGTTTCACGACGATCTCCGGCGAGCCCCCTGCCGTCGCCGCCCTGCTCCGCGAT21360             CTGGAGTCCGAGGGCGTCTTCGCCCTCAAGCTGAGTTACGACTTCGCCTCCCACTCCGCG21420             CAGGTCGAGTCGATTCGCGACGAGCTCCTCGATCTCCTGTCGTGGCTCGAGCCGCGCTCG21480             ACGGCGGTCCCGTTCTACTCCACGGTGAGCGGCGCCGCGATCGACGGGAGCGAGCTCGAC21540             GCCGCCTACTGGTACCGGAACCTCCGGCAGCCGGTCCGCTTCGCAGACGCTGTGCAAGGC21600             CTCCTTGCCGGAGAACATCGCTTCTTCGTGGAGGTGAGCCCCAGTCCTGTGCTGACCTTG21660             GCCTTGCACGAGCTCCTCGAAGCGTCGGAGCGCTCGGCGGCGGTGGTCGGCTCTCTGTGG21720             AGCGACGAAGGGGATCTACGGCGCTTCCTCGTCTCGCTCTCCGAGCTCTACGTCAACGGC21780             TTCGCCCTGGATTGGACGACGATCCTGCCCCCCGGGAAGCGGGTGCCGCTGCCCACCTAC21840             CCCTTCCAGCGCGAGCGCTTCTGGCTCGACGCCTCCACGGCACCCGCCGCCGGCGTCAAC21900             CACCTTGCTCCGCTCGAGGGGCGGTTCTGGCAGGCCATCGAGAGCGGGAATATCGACGCG21960             CTCAGCGGCCAGCTCCACGTGGACGGCGACGAGCAGCGCGCCGCCCTTGCCCTGCTCCTT22020             CCCACCCTCGCGAGCTTTCGCCACGAGCGGCAAGAGCAGGGCACGGTCGACGCCTGGCGC22080             TACCGCATCACGTGGAAGCCTCTGACCACCGCCACCACGCCCGCCGACCTGGCCGGCACC22140             TGGCTCCTCGTCGTGCCGGCCGCTCTGGACGACGACGCGCTCCCCTCCGCGCTCACCGAG22200             
GCGCTCGCCCGGCGCGGCGCGCGCGTCCTCGCCGTGCGCCTGAGCCAGGCCCACCTGGAC22260             CGCGAGGCTCTCGCCGAGCACCTGCGCCAGGCTTGCGCCGAGACCGCGCCGCCTCGCGGC22320             GTGCTCTCGCTCCTCGCCCTCGACGAAAGTCCCCTCGCCGACCATGCCGCCGTGCCCGCG22380             GGACTCGCCTTCTCGCTCACCCTCGTCCAAGCCCTCGGCGACATCGCCCTCGACGCGCCC22440             TTGTGGCTCTTCACCCGCGGCGCCGTCTCCGTCGGACACTCCGACCCCATCGCCCATCCG22500             ACGCAGGCGATGACCTGGGGCCTGGGCCGCGTCGTCGGCCTCGAGCACCCCGAGCGCTGG22560             GGAGGGCTCGTCGACGTCGGCGCAGCGATCGACGCGAGCGCCGTGGGCCGCTTGCTCCCG22620             GTCCTCGCCCTGCGCAACGATGAGGACCAGCTCGCTCTCCGCCCGGCCGGGTTCTACGCT22680             CGCCGCCTCGTCCGCGCTCCGCTCGGCGACGCGCCGCCCGCACGTACCTTCAAGCCCCGA22740             GGCACCCTCCTCATCACCGGAGGCACCGGCGCCGCTGGCGCTCACGTCGCCCGATGGCTC22800             GCTCGAGAAGGCGCAGAGCACCTCGTCCTCATCAGCCGCCGAGGGGCCCAGGCCGAGGGC22860             GCCTCGGAGCTCCACGCCGAGCTCACGGCCCTGGGCGCGCGCGTCACCTTCGCCGCGTGT22920             GATGTCGCCGACAGGAGCGCTGTCGCCACGCTTCTCGAGCAGCTCGACGCCGAAGGGTCG22980             CAGGTCCGCGCCGTGTTCCACGCGGGCGGCATCGGGCGCCACGCTCCGCTCGCCGCCACC23040             TCTCTCATGGAGCTCGCCGACGTTGTCTCTGCCAAGGTCCTAGGCGCAGGGAACCTCCAC23100             GACCTGCTCGGTCCTCGACCCCTCGACGCCTTCGTCCTTTTCTCGTCCATCGCAGGCGTC23160             TGGGGCGGCGGACAACAAGCCGGATACGCCGCCGGAAACGCCTTCCTCGACGCCCTGGCC23220             GACCAGCGGCGCAGTCTTGGACAGCCGGACACGTCCGTGGTGTGGGGCGCGTGGGGCGGC23280             GGCGGTGGTATATTCACGGGGCCCCTGGCAGCCCAGCTGGAGCAACGTCGTCTGTCGCCG23340             ATGGCCCCTTCGCTGGCCGTGGCGGCGCTCGCGCAAGCCCTGGAGCACGACGAGACCACC23400             GTCACCGTCGCCGACATCGACTGGGCGCGCTTTGCGCCTTCGATCAGCGTCGCTCGCTCC23460             CGCCGCTCCTGCGCGACTTGCCCGAGCAGCGCGCCCTCGAAGACAGAGAAGGCGCGTCCT23520             CCTCCGAGCACGGCCCGGCCCCCCGACCTCCTCGACAAGCTCCGGAGCCGCTCGGAGAGC23580             GAGCAGCTCCGTCTGCTCGCCGCGCTGGTGTGCGACGAGACGGCCCTCGTCCTCGGCCAC23640             GAAGGCCGCTTCCCAGCTCGACCCCGACAAGGCTTCTTCGACCTCGGTCTCGATTCGATC23700             
ATGACCGTCGAGCTTCGTCGGCGCTTGCAACAGGCCACCGGCATCAAGCTCCCGGCCACC23760             CTCGCCTTCGACCATCCCTCTCCTCATCGCGTCGCGCTCTTCATGCGCGACTCGCTCGCC23820             CACGCCCTCGGCACGAGGCTCTCCGCCGAGGCGACGCCGCCGCGCTCCGGCCGCGCCTCG23880             AGCGACGAGCCCATCGCCATCGTCGGCATGGCCCTGCGCCTGCCGGGCGGCGTCGGCGAT23940             GTCGACGCTCTTTGGGAGTTCCTCCACCAAGGGCGCGACGCGGTCGAGCCCATTCCACAG24000             AGCCGCTGGGACGCCGGTGCCCTCTACGACCCCGACCCCGACGCCGACGCCAAGAGCTAC24060             GTCCGGCATGCCGCGATGCTCGACCAGATCGACCTCTTCGACCCTGCCTTCTTCGGCATC24120             AGCCCCCGGGAGGCCAAACACCTCGACCCCCAGCACCGCCTGCTCCTCGAATCTGCCTGG24180             CTGGCCCTCGAGGACGCCGGCATCGTCCCCACCTCCCTCAAGGACTCCCTCACCGGCGTC24240             TTCGTCGGCATCTGCGCCGGCGAATACGCGATGCAAGAGGCGAGCTCGGAAGGTTCCGAG24300             GTTTACTTCATCCAAGGCACTTCCGCGTCCTTTGGCGCGGGGGGCTTGGCCTATACGCTC24360             GGGCTCCAGGGGCCGCGATCTTCGGTCGACACCGCCTGCTCCTCCTCGCTCGTCTCCCTC24420             CACCTCGCCTGCCAAGCCCTCCGACAGGGCGAGTGCAACCTCGCCCTCGCCGCGGGCGTG24480             TCGCTCATGGTCTCCCCCCAGACCTTCGTCATCCTTTCCCGTCTGCGCGCCTTGGCGCCC24540             GACGGCCGCTCCAAGACCTTCTCGGACAACGCCGACGGCTACGGACGCGGAGAAGGCGTC24600             GTCGTCCTTGCCCTCGAGCGGATCGGCGACGCCCTCGCCCGGAGACACCGCGTCCTCGTC24660             CTCGTCCGCGGCACCGCCATCAACCACGACGGCGCGTCGAGCGGTATCACCGCCCCCAAC24720             GGCACCTCCCAGCAGAAGGTCCTCCGGGCCGCGCTCCACGACGCCCGCATCACCCCCGCC24780             GACGTCGACGTCGTCGAGTGCCATGGCACCGGCACCTCGCTGGGAGACCCCATCGAGGTG24840             CAAGCCCTGGCCGCCGTCTACGCCGACGGCAGACCCGCTGAAAAGCCTCTCCTTCTCGGC24900             GCGCTCAAGACCAACATCGGCCATCTCGAGGCCGCCTCCGGCCTCGCGGGCGTCGCCAAG24960             ATGGTCGCCTCGCTCCGCCACGACGCCCTGCCCCCCACCCTCCACGCGACCCCACGCAAT25020             CCCCTCATCGAGTGGGAGGCGCTCGCCATCGACGTCGTCGATACCCCGAGGCCTTGGCCC25080             CGCCACGAAGATGGCAGTCCCCGCCGCGCCGGCATCTCCGCCTTCGGATTCTCGGGCACC25140             AACGCCCACGTCATCCTCGAAGAGGCTCCCGCCGCCCTGCCGGCCGAGCCCGCCACCTCA25200             
CAGCCGGCGTCGCAAGCCGCTCCCGCGGCGTGGCCCGTGCTCCTGTCGGCCAGGAGCGAG25260             GCCGCCGTCCGCGCCCAGGCGAAGCGGCTCCGCGACCACCTCGTCGCCCACGACGACCTC25320             ACCCTCGCGGATGTGGCCTATTCGCTGGCCACCACCCGCGCCCACTTCGAGCACCGCGCC25380             GCTCTCGTAGCCCACAACCGCGACGAGCTCCTCTCCGCGCTCGACTCGCTCGCCCAGGAC25440             AAGCCCGCCCCGAGCACCGTCCTCGGACGGAGCGGAAGCCACGGCAAGCTCGTCTTCGTC25500             TTTCCTGGGCAAGGCTCGCAGTGGGAAGGGATGGCCCTCTCGCTGCTCGACTCCTCGCCC25560             GTCTTCCGCGCTCAGCTCGAAGCATGCGAGCGCGCGCTCGCTCCTCACGTCGAGTGGAGC25620             CTGCTCGCCGTCCTGCGCCGCGACGAGGGCGCCCCCTCCCTCGACCGCGTCGACGTCGTA25680             CAGCCCGCCCTCTTTGCCGTCATGGTCTCCCTGGCGGCCCTCTGGCGCTCGCTCGGCGTA25740             GAGCCCGCCGCCGTCGTCGGCCACAGTCAGGGCGAGATCGCCGCCGCCTTCGTCGCAGGC25800             GCTCTCTCCCTCGAGGACGCGGCCCGCATCGCCGCCCTGCGCAGCAAAGCGCTCACCACC25860             GTCGCCGGCAACGGGGCCATGGCCGCCGTCGAGCTCGGCGCCTCCGACCTCCAGACCTAC25920             CTCGCTCCCTGGGGCGACAGGCTCTCCATCGCCGCCGTCAACAGCCCCAGGGCCACGCTC25980             GTGTCCGGCGAGCCCGCCGCCATCGACGCGCTGATCGACTCGCTCACCGCAGCGCAGGTC26040             TTCGCCCGAAAAGTCCGCGTCGACTACGCCTCCCACTCCGCCCAGATGGACGCCGTCCAA26100             GACGAGCTCGCCGCAGGTCTAGCCAACATCGCTCCTCGGACGTGCGAGCTCCCTCTTTAT26160             TCGACCGTCACCGGCACCAGGCTCGACGGCTCCGAGCTCGACGGCGCGTACTGGTATCGA26220             AACCTCCGGCAAACCGTCCTGTTCTCGAGCGCGACCGAGCGGCTCCTCGACGATGGGCAT26280             CGCTTCTTCGTCGAGGTCAGCCCCCATCCCGTGCTCACGCTCGCCCTCCGCGAGACCTGC26340             GAGCGCTCACCGCTCGATCCCGTCGTCGTCGGCTCCATTCGACGCGACGAAGGCCACCTC26400             GCCCGCCTGCTCCTCTCCTGGGCGGAGCTCTCTACCCGAGGCCTCGCGCTCGACTGGAAC26460             GCCTTCTTCGCGCCCTTCGCTCCCCGCAAGGTCTCCCTCCCCACCTACCCCTTCCAACGC26520             GAGCGCTTCTGGCTCGACGCCTCCACGGCGCACGCTGCCGACGTCGCCTCCGCAGGCCTG26580             ACCTCGGCCGACCACCCGCTGCTCGGCGCCGCCGTCGCCCTCGCCGACCGCGATGGCTTT26640             GTCTTCACAGGACGGCTCTCCCTCGCAGAGCACCCGTGGCTCGAAGACCACGTCGTCTTC26700             
GGCATACCCTGTCCTGCCAGGCGCCGCCTCCTCGAGCTCGCCCTGCATGTCGCCCATCTC26760             GTCGGCCTCGACACCGTCGAAGACGTCACGCTCGACCCCCCCCTCGCTCTCCCATCGCAG26820             GGCGCCGTCCTCCTCCAGATCTCCGTCGGGCCCGCGGACGGTGCTGGACGAAGGGCGCTC26880             TCCGTTCATAGCCGGCGCCACGACGCGCTTCAGGATGGCCCCTGGACTCGCCACGCCAGC26940             GGCTCTCTCGCGCAAGCTAGCCCGTCCCATTGCCTTCGATGCTCCGCGAATGGCCCCCCC27000             TCGGGCGCCACCCAGGTGGACACCCAAGGTTTCTACGCAGCCCTCGAGAGCGCTGGGCTT27060             GCTTATGGCCCCGAGTTCCAGGGCCTCCGCCGCCGTCTACAAGCGCGGCGACGAGCTCTT27120             CGCCGAAGCCAAGCTCCCGGACGCCGCCGAAGAGGACGCCGCTCGTTTTGCCCTCCACCC27180             CGCCCTGCTCGACAGCGCCTTGCAGGCGCTCGCCTTTGTAGACGACCAGGCAAAGGCCTT27240             CAGGATGCCCTTCTCGTGGAGCGGAGTATCGCTGCGCTCCGGTCGGAGCCACCACCCTGC27300             GCGTGCGTTTCCACCGTCCTGAGGGCGAATCCTCGCGCTCGCTCCTCCTCGCCGACGCCA27360             GAGGCGAACCCATCGCCTCGGTGCAAGCGCTCGCCATGCGCGCCGCGTCCGCCGAGCAGC27420             TCCGCAGACCCGGGAGCGTCCCACCTCGATGCCCTCTTCCGCATCGACTGGAGCGAGCTG27480             CAAAGCCCCACCTCACCGCCCATCGCCCCGAGCGGTGCCCTCCTCGGCACAGAAGGTCTC27540             GACCTCGGGACCAGGGTGCCTCTCGACCGCTATACCGACCTTGCTGCTCTACGCAGCGCC27600             CTCGACCAGGGCGCTTCGCCTCCAAGCCTCGTCATCGCCCCCTTCATCGCTCTGCCCGAA27660             GGCGACCTCATCGCGAGCGCCCGCGAGACCACCGCGCACGCGCTCGCCCTCTTGCAAGCC27720             TGGCTCGCCGACGAGCGCCTCGCCTCCTCGCGCCTCGCCCTCGTCACCCGACGCGCCGTC27780             GCCACCCACGCTGAAGAAGACGTCAAGGGCCTCGCTCACGCGCCTCTCTGGGGTCTCGCT27840             CGCTCCGCGCAGAGCGAGCACCCAGAGCGCCCTCTCGTCCTCGTCGACCTCGACGACAGC27900             GAGGCCTCCCAGCACGCCCTGCTCGGCGCGCTCGACGCAAGAGAGCCAGAGATCGCCCTC27960             CGCAACGGCAAACCCCTCGTTCCAAGGCTCTCACGCCTGCCCCAGGCGCCCACGGACACA28020             GCGTCCCCCGCAGGCCTCGGAGGCACCGTCCTCATCACGGGAGGCACCGGCACGCTCGGC28080             GCCCTGGTCGCGCGCCGCCTCGTCGTAAACCACGACGCCAAGCACCTGCTCCTCACCTCG28140             CGCCAGGGCGCGAGCGCTCCGGGTGCTGATGTCTTGCGAAGCGAGCTCGAAGCTCTGGGG28200             
GCTTCGGTCACCCTCGCCGCGTGCGACGTGGCCGATCCACGCGCTCTAAAGGACCTTCTG 28260
GATAACATTCCGAGCGCTCACCCGGTCGCCGCCGTCGTGCATGCCGCCAGCGTCCTCGAC 28320
GGCGATCTGCTCGGCGCCATGAGCCTCGAGCGGATCGACCGCGTCTTCGCCCCCAAGATC 28380
GATGCCGCCTGGCACTTGCATCAGCTCACCCAAGATAAGCCCCTTGCCGCCTTCATCCTC 28440
TTCTCGTCCGTCGCCGGCGTCCTCGGCAGCTCAGGTCACTCCAACTACGCCGCTGCGAGC 28500
GCCTTCCTCGATGCGCTTGCGCACCACCGGCGCGCGCAAGGGCTCCCTGCCTCATCGCTC 28560
GCGTGGAGCCACTGGGCCGAGCGCAGCGCAATGACAGAGCACGTCAGCGCCGCCGGCGCC 28620
CCTCGCATGGAGCGCGCCGGCCTTCCCTCGACCTCTGAGGAGAGGCTCGCCCTCTTCGAT 28680
GCGGCGCTCTTCCGAACCGAGACCGCCCTGGTCCCCGCGCGCTTCGACTTGAGCGCGCTC 28740
AGGGCGAACGCCGGCAGCGTCCCCCCGTTGTTCCAACGTCTCGTCCGCGCTCGCACCGTA 28800
CGCAAGGCCGCCAGCAACACCGCCCAGGCCTCGTCGCTTACAGAGCGCCTCTCAGCCCTC 28860
CCGCCCGCCGAACGCGAGCGTGCCCTGCTCGATCTCATCCGCACCGAAGCCGCCGCCGTC 28920
CTCGGCCTCGCCTCCTTCGAATCGCTCGATCCCGATCG 28958

(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..13
(D) OTHER INFORMATION: /note= "sequence of a
plant consensus translation initiator (Clontech)"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
GTCGACCATGGTC 13

(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..12
(D) OTHER INFORMATION: /note= "sequence of a plant
consensus translation initiator (Joshi)"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
TAAACAATGGCT 12

(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..22
(D) OTHER INFORMATION: /note= "sequence of an
oligonucleotide for use in a molecular adaptor"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
AATTCTAAAGCATGCCGATCGG 22

(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..21
(D) OTHER INFORMATION: /note= "sequence of an
oligonucleotide for use in a molecular adaptor"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
AATTCCGATCGGCATGCTTTA 21

(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..22
(D) OTHER INFORMATION: /note= "sequence of an
oligonucleotide for use in a molecular adaptor"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
AATTCTAAACCATGGCGATCGG 22

(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..21
(D) OTHER INFORMATION: /note= "sequence of an
oligonucleotide for use in a molecular adaptor"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
AATTCCGATCGCCATGGTTTA 21

(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..15
(D) OTHER INFORMATION: /note= "sequence of an
oligonucleotide for use in a molecular adaptor"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
CCAGCTGGAATTCCG 15

(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..19
(D) OTHER INFORMATION: /note= "sequence of an
oligonucleotide for use in a molecular adaptor"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
CGGAATTCCAGCTGGCATG 19

(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..11
(D) OTHER INFORMATION: /note= "oligonucleotide used to
introduce base change into SphI site of ORF1 of
pyrrolnitrin gene cluster"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
CCCCCTCATGC 11

(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..11
(D) OTHER INFORMATION: /note= "oligonucleotide used to
introduce base change into SphI site of ORF1 of
pyrrolnitrin gene cluster"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
GCATGAGGGGG 11

(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4603 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 230..1594
(D) OTHER INFORMATION: /gene= "phz1"
/label= ORF1
/note= "Open Reading Frame #1 for DNA sequence"
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1598..2758
(D) OTHER INFORMATION: /gene= "phz2"
/label= ORF2
/note= "Open Reading Frame #2 for DNA sequence"
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2764..3597
(D) OTHER INFORMATION: /gene= "phz3"
/label= ORF3
/note= "Open Reading Frame #3 for DNA sequence"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 3597..4262
(D) OTHER INFORMATION: /label=ORF4
/note= "Open Reading Frame #4 of DNA sequence.
This information is repeated in SEQ ID NO:21 due to
overlapping ORFs."
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..4603
(D) OTHER INFORMATION: /note= "Four open reading frames
(ORFs) were identified within this DNA sequence as described
in Example 18 of the specification."
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
GCATGCCGTGACCTCCGCCGGTGGCGTGGCCGCCGGCCTGCACCTGGAAACCACCCCTGA 60
CGACGTCAGCGAGTGCGCTTCCGATGCCGCCGGCCTGCATCAGGTCGCCAGCCGCTACAA 120
AAGCCTGTGCGACCCGCGCCTGAACCCCTGGCAAGCCATTACTGCGGTGATGGCCTGGAA 180
AAACCAGCCCTCTTCAACCCTTGCCTCCTTTTGACTGGAGTTTGTCGTCATGACC 235
MetThr
GGCATTCCATCGATCGTCCCTTACGCCTTGCCTACCAACCGCGACCTG 283
GlyIleProSerIleValProTyrAlaLeuProThrAsnArgAspLeu
5 10 15
CCCGTCAACCTCGCGCAATGGAGCATCGACCCCGAGCGTGCCGTGCTG 331
ProValAsnLeuAlaGlnTrpSerIleAspProGluArgAlaValLeu
20 25 30
CTGGTGCATGACATGCAGCGCTACTTCCTGCGGCCCTTGCCCGACGCC 379
LeuValHisAspMetGlnArgTyrPheLeuArgProLeuProAspAla
35 40 45 50
CTGCGTGACGAAGTCGTGAGCAATGCCGCGCGCATTCGCCAGTGGGCT 427
LeuArgAspGluValValSerAsnAlaAlaArgIleArgGlnTrpAla
55 60 65
GCCGACAACGGCGTTCCGGTGGCCTACACCGCCCAGCCCGGCAGCATG 475
AlaAspAsnGlyValProValAlaTyrThrAlaGlnProGlySerMet
70 75 80
AGCGAGGAGCAACGCGGGCTGCTCAAGGACTTCTGGGGCCCGGGCATG 523
SerGluGluGlnArgGlyLeuLeuLysAspPheTrpGlyProGlyMet
85 90 95
AAGGCCAGCCCCGCCGACCGCGAGGTGGTCGGCGCCCTGACGCCCAAG 571
LysAlaSerProAlaAspArgGluValValGlyAlaLeuThrProLys
100 105 110
CCCGGCGACTGGCTGCTGACCAAGTGGCGCTACAGCGCGTTCTTCAAC 619
ProGlyAspTrpLeuLeuThrLysTrpArgTyrSerAlaPhePheAsn
115 120 125 130
TCCGACCTGCTGGAACGCATGCGCGCCAACGGGCGCGATCAGTTGATC 667
SerAspLeuLeuGluArgMetArgAlaAsnGlyArgAspGlnLeuIle
135 140 145
CTGTGCGGGGTGTACGCCCATGTCGGGGTACTGATTTCCACCGTGGAT 715
LeuCysGlyValTyrAlaHisValGlyValLeuIleSerThrValAsp
150 155 160
GCCTACTCCAACGATATCCAGCCGTTCCTCGTTGCCGACGCGATCGCC 763
AlaTyrSerAsnAspIleGlnProPheLeuValAlaAspAlaIleAla
165 170 175
GACTTCAGCAAAGAGCACCACTGGATGCCATCGAATACGCCGCCAGCC 811
AspPheSerLysGluHisHisTrpMetProSerAsnThrProProAla
180 185 190
GTTGCGCCATGTCATCACCACCGACGAGGTGGTGCTATGAGCCAGACC 859
ValAlaProCysHisHisHisArgArgGlyGlyAlaMetSerGlnThr
195 200 205 210
GCAGCCCACCTCATGGAACGCATCCTGCAACCGGCTCCCGAGCCGTTT 907
AlaAlaHisLeuMetGluArgIleLeuGlnProAlaProGluProPhe
215 220 225
GCCCTGTTGTACCGCCCGGAATCCAGTGGCCCCGGCCTGCTGGACGTG 955
AlaLeuLeuTyrArgProGluSerSerGlyProGlyLeuLeuAspVal
230 235 240
CTGATCGGCGAAATGTCGGAACCGCAGGTCCTGGCCGATATCGACTTG 1003
LeuIleGlyGluMetSerGluProGlnValLeuAlaAspIleAspLeu
245 250 255
CCTGCCACCTCGATCGGCGCGCCTCGCCTGGATGTACTGGCGCTGATC 1051
ProAlaThrSerIleGlyAlaProArgLeuAspValLeuAlaLeuIle
260 265 270
CCCTACCGCCAGATCGCCGAACGCGGTTTCGAGGCGGTGGACGATGAG 1099
ProTyrArgGlnIleAlaGluArgGlyPheGluAlaValAspAspGlu
275 280 285 290
TCGCCGCTGCTGGCGATGAACATCACCGAGCAGCAATCCATCAGCATC 1147
SerProLeuLeuAlaMetAsnIleThrGluGlnGlnSerIleSerIle
295 300 305
GAGCGCTTGCTGGGAATGCTGCCCAACGTGCCGATCCAGTTGAACAGC 1195
GluArgLeuLeuGlyMetLeuProAsnValProIleGlnLeuAsnSer
310 315 320
GAACGCTTCGACCTCAGCGACGCGAGCTACGCCGAGATCGTCAGCCAG 1243
GluArgPheAspLeuSerAspAlaSerTyrAlaGluIleValSerGln
325 330 335
                                     GTGATCGCCAATGAAATCGGCTCCGGGGAAGGCGCCAACTTCGTCATC1291                          ValIleAlaAsnGluIleGlySerGlyGluGlyAlaAsnPheValIle                              340345350                                                                     AAACGCACCTTCCTGGCCGAGATCAGCGAATACGGCCCGGCCAGTGCG1339                          LysArgThrPheLeuAlaGluIleSerGluTyrGlyProAlaSerAla                              355360365370                                                                  CTGTCGTTCTTTCGCCATCTGCTGGAACGGGAGAAAGGCGCCTACTGG1387                          LeuSerPhePheArgHisLeuLeuGluArgGluLysGlyAlaTyrTrp                              375380385                                                                     ACGTTCATCATCCACACCGGCAGCCGTACCTTCGTGGGTGCGTCCCCC1435                          ThrPheIleIleHisThrGlySerArgThrPheValGlyAlaSerPro                              390395400                                                                     GAGCGCCACATCAGCATCAAGGATGGGCTCTCGGTGATGAACCCCATC1483                          GluArgHisIleSerIleLysAspGlyLeuSerValMetAsnProIle                              405410415                                                                     AGCGGCACTTACCGCTATCCGCCCGCCGGCCCCAACCTGTCGGAAGTC1531                          SerGlyThrTyrArgTyrProProAlaGlyProAsnLeuSerGluVal                              420425430                                                                     ATGGACTTCCTGGCGGATCGCAAGGAAGCCGACGAGCTCTACATGGTG1579                          MetAspPheLeuAlaAspArgLysGluAlaAspGluLeuTyrMetVal                              435440445450                                                                  GTGGATGAAGAGCTGTAAATGATGGCGCGCATTTGTGAGGACGGCGGC1627                          ValAspGluGluLeuMetMetAlaArgIleCysGluAspGlyGly                                 4551510                                                                       CACGTCCTCGGCCCTTACCTCAAGGAAATGGCGCACCTGGCCCACACC1675                          
HisValLeuGlyProTyrLeuLysGluMetAlaHisLeuAlaHisThr                              152025                                                                        GAGTACTTCATCGAAGGCAAGACCCATCGCGATGTACGGGAAATCCTG1723                          GluTyrPheIleGluGlyLysThrHisArgAspValArgGluIleLeu                              303540                                                                        CGCGAAACCCTGTTTGCGCCCACCGTCACCGGCAGCCCACTGGAAAGC1771                          ArgGluThrLeuPheAlaProThrValThrGlySerProLeuGluSer                              455055                                                                        GCCTGCCGGGTCATCCAGCGCTATGANCCGCAAGGCCGCGCGTACTAC1819                          AlaCysArgValIleGlnArgTyrXaaProGlnGlyArgAlaTyrTyr                              606570                                                                        AGCGGCATGGCTGCGCTGATCGGCAGCGATGGCAAGGGCGGGCGTTCC1867                          SerGlyMetAlaAlaLeuIleGlySerAspGlyLysGlyGlyArgSer                              75808590                                                                      CTGGACTCCGCGATCCTGATTCGTACCGCCGACATCGATAACAGCGGC1915                          LeuAspSerAlaIleLeuIleArgThrAlaAspIleAspAsnSerGly                              95100105                                                                      GAGGTGCGGATCAGCGTGGGCTCGACCATCGTGCGCCATTCCGACCCG1963                          GluValArgIleSerValGlySerThrIleValArgHisSerAspPro                              110115120                                                                     ATGACCGAGGCTGCCGAAAGCCGGGCCAAGGCCACTGGCCTGATCAGC2011                          MetThrGluAlaAlaGluSerArgAlaLysAlaThrGlyLeuIleSer                              125130135                                                                     GCACTGAAAAACCAGGCGCCCTCGCGCTTCGGCAATCACCTGCAAGTG2059                          AlaLeuLysAsnGlnAlaProSerArgPheGlyAsnHisLeuGlnVal                              140145150                                         
                            CGCGCCGCATTGGCCAGCCGCAATGCCTACGTCTCGGACTTCTGGCTG2107                          ArgAlaAlaLeuAlaSerArgAsnAlaTyrValSerAspPheTrpLeu                              155160165170                                                                  ATGGACAGCCAGCAGCGGGAGCAGATCCAGGCCGACTTCAGTGGGCGC2155                          MetAspSerGlnGlnArgGluGlnIleGlnAlaAspPheSerGlyArg                              175180185                                                                     CAGGTGCTGATCGTCGACGCCGAAGACACCTTCACCTCGATGATCGCC2203                          GlnValLeuIleValAspAlaGluAspThrPheThrSerMetIleAla                              190195200                                                                     AAGCAACTGCGGGCCCTGGGCCTGGTAGTGACGGTGTGCAGCTTCAGC2251                          LysGlnLeuArgAlaLeuGlyLeuValValThrValCysSerPheSer                              205210215                                                                     GACGAATACAGCTTTGAAGGCTACGACCTGGTCATCATGGGCCCCGGC2299                          AspGluTyrSerPheGluGlyTyrAspLeuValIleMetGlyProGly                              220225230                                                                     CCCGGCAACCCGAGCGAAGTCCAACAGCCGAAAATCAACCACCTGCAC2347                          ProGlyAsnProSerGluValGlnGlnProLysIleAsnHisLeuHis                              235240245250                                                                  GTGGCCATCCGCTCCTTGCTCAGCCAGCAGCGGCCATTCCTCGCGGTG2395                          ValAlaIleArgSerLeuLeuSerGlnGlnArgProPheLeuAlaVal                              255260265                                                                     TGCCTGAGCCATCAGGTGCTGAGCCTGTGCCTGGGCCTGGAACTGCAG2443                          CysLeuSerHisGlnValLeuSerLeuCysLeuGlyLeuGluLeuGln                              270275280                                                                     CGCAAAGCCATTCCCAACCAGGGCGTGCAAAAACAGATCGACCTGTTT2491                          
ArgLysAlaIleProAsnGlnGlyValGlnLysGlnIleAspLeuPhe                              285290295                                                                     GGCAATGTCGAACGGGTGGGTTTCTACAACACCTTCGCCGCCCAGAGC2539                          GlyAsnValGluArgValGlyPheTyrAsnThrPheAlaAlaGlnSer                              300305310                                                                     TCGAGTGACCGCCTGGACATCGACGGCATCGGCACCGTCGAAATCAGC2587                          SerSerAspArgLeuAspIleAspGlyIleGlyThrValGluIleSer                              315320325330                                                                  CGCGACAGCGAGACCGGCGAGGTGCATGCCCTGCGTGGCCCCTCGTTC2635                          ArgAspSerGluThrGlyGluValHisAlaLeuArgGlyProSerPhe                              335340345                                                                     GCCTCCATGCAGTTTCATGCCGAGTCGCTGCTGACCCAGGAAGGTCCG2683                          AlaSerMetGlnPheHisAlaGluSerLeuLeuThrGlnGluGlyPro                              350355360                                                                     CGCATCATCGCCGACCTGCTGCGGCACGCCCTGATCCACACACCTGTC2731                          ArgIleIleAlaAspLeuLeuArgHisAlaLeuIleHisThrProVal                              365370375                                                                     GAGAACAACGCTTCGGCCGCCGGGAGATAACCATGCACCATTACGTC2778                           GluAsnAsnAlaSerAlaAlaGlyArgMetHisHisTyrVal                                    38038515                                                                      ATCATCGACGCCTTTGCCAGCGTCCCGCTGGAAGGCAATCCGGTCGCG2826                          IleIleAspAlaPheAlaSerValProLeuGluGlyAsnProValAla                              101520                                                                        GTGTTCTTTGACGCCGATGACTTGTCGGCCGAGCAAATGCAACGCATT2874                          ValPhePheAspAlaAspAspLeuSerAlaGluGlnMetGlnArgIle                              253035                                            
                            GCCCGGGAGATGAACCTGTCGGAAACCACTTTCGTGCTCAAGCCACGT2922                          AlaArgGluMetAsnLeuSerGluThrThrPheValLeuLysProArg                              404550                                                                        AACTGCGGCGATGCGCTGATCCGGATCTTCACCCCGGTCAACGAACTG2970                          AsnCysGlyAspAlaLeuIleArgIlePheThrProValAsnGluLeu                              556065                                                                        CCCTTCGCCGGGCACCCGTTGCTGGGCACGGACATTGCCCTGGGTGCG3018                          ProPheAlaGlyHisProLeuLeuGlyThrAspIleAlaLeuGlyAla                              70758085                                                                      CGCACCGACAATCACCGGCTGTTCCTGGAAACCCAGATGGGCACCATC3066                          ArgThrAspAsnHisArgLeuPheLeuGluThrGlnMetGlyThrIle                              9095100                                                                       GCCTTTGAGCTGGAGCGCCAGAACGGCAGCGTCATCGCCGCCAGCATG3114                          AlaPheGluLeuGluArgGlnAsnGlySerValIleAlaAlaSerMet                              105110115                                                                     GACCAGCCGATACCGACCTGGACGGCCCTGGGGCGCGACGCCGAGTTG3162                          AspGlnProIleProThrTrpThrAlaLeuGlyArgAspAlaGluLeu                              120125130                                                                     CTCAAGGCCCTGGGCATCAGCGACTCGACCTTTCCCATCGAGATCTAT3210                          LeuLysAlaLeuGlyIleSerAspSerThrPheProIleGluIleTyr                              135140145                                                                     CACAACGGCCCGCGTCATGTGTTTGTCGGCCTGCCAAGCATCGCCGCG3258                          HisAsnGlyProArgHisValPheValGlyLeuProSerIleAlaAla                              150155160165                                                                  CTGTCGGCCCTGCACCCCGACCACCGTGCCCTGTACAGCTTCCACGAC3306                          
LeuSerAlaLeuHisProAspHisArgAlaLeuTyrSerPheHisAsp                              170175180                                                                     ATGGCCATCAACTGTTTTGCCGGTGCGGGACGGCGCTGGCGCAGCCGG3354                          MetAlaIleAsnCysPheAlaGlyAlaGlyArgArgTrpArgSerArg                              185190195                                                                     ATGTTCTCGCCGGCCTATGGGGTGGTCGAGGATGCGNCCACGGGCTCC3402                          MetPheSerProAlaTyrGlyValValGluAspAlaXaaThrGlySer                              200205210                                                                     GCTGCCGGGCCCTTGGCGATCCATCTGGCGCGGCATGGCCAGATCGAG3450                          AlaAlaGlyProLeuAlaIleHisLeuAlaArgHisGlyGlnIleGlu                              215220225                                                                     TTCGGCCAGCAGATCGAAATTCTTCAGGGCGTGGAAATCGGCCGCCCC3498                          PheGlyGlnGlnIleGluIleLeuGlnGlyValGluIleGlyArgPro                              230235240245                                                                  TCACTCATGTTCGCCCGGGCCGAGGGCCGCGCCGATCAACTGACGCGG3546                          SerLeuMetPheAlaArgAlaGluGlyArgAlaAspGlnLeuThrArg                              250255260                                                                     GTCGAAGTATCAGGCAATGGCATCACCTTCGGACGGGGGACCATCGTT3594                          ValGluValSerGlyAsnGlyIleThrPheGlyArgGlyThrIleVal                              265270275                                                                     CTATGAACAGTTCAGTACTAGGCAAGCCGCTGTTGGGTAAAGGCATGTCGGAA3647                     Leu                                                                           TCGCTGACCGGCACACTGGATGCGCCGTTCCCCGAGTACCAGAAGCCGCCTGCCGATCCC3707              ATGAGCGTGCTGCACAACTGGCTCGAACGCGCACGCCGCGTGGGCATCCGCGAACCCCGT3767              GCGCTGGCGCTGGCCACGGCTGACAGCCAGGGCCGGCCTTCGACACGCATCGTGGTGATC3827              
AGTGAGATCAGTGACACCGGGGTGCTGTTCAGCACCCATGCCGGAAGCCAGAAAGGCCGC3887              GAACTGACAGAGAACCCCTGGGCCTCGGGGACGCTGTATTGGCGCGAAACCAGCCAGCAG3947              ATCATCCTCAATGGCCAGGCCGTGCGCATGCCGGATGCCAAGGCTGACGAGGCCTGGTTG4007              AAGCGCCCTTATGCCACGCATCCGATGTCATCGGTGTCTCGCCAGAGTGAAGAACTCAAG4067              GATGTTCAAGCCATGCGCAACGCCGCCAGGGAACTGGCCGAGGTTCAAGGTCCGCTGCCG4127              CGTCCCGAGGGTTATTGCGTGTTTGAGTTACGGCTTGAATCGCTGGAGTTCTGGGGTAAC4187              GGCGAGGAGCGCCTGCATGAACGCTTGCGCTATGACCGCAGCGCTGAAGGCTGGAAACAT4247              CGCCGGTTACAGCCATAGGGTCCCGCGATAAACATGCTTTGAAGTGCCTGGCTGCTCCAG4307              CTTCGAACTCATTGCGCAAACTTCAACACTTATGACACCCGGTCAACATGAGAAAAGTCC4367              AGATGCGAAAGAACGCGTATTCGAAATACCAAACAGAGAGTCCGGATCACCAAAGTGTGT4427              AACGACATTAACTCCTATCTGAATTTTATAGTTGCTCTAGAACGTTGTCCTTGACCCAGC4487              GATAGACATCGGGCCAGAACCTACATAAACAAAGTCAGACATTACTGAGGCTGCTACCAT4547              GCTAGATTTTCAAAACAAGCGTAAATATCTGAAAAGTGCAGAATCCTTCAAAGCTT4603                  (2) INFORMATION FOR SEQ ID NO:18:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 455 amino acids                                                   (B) TYPE: amino acid                                                          (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: protein                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:                                      MetThrGlyIleProSerIleValProTyrAlaLeuProThrAsnArg                              151015                                                                        AspLeuProValAsnLeuAlaGlnTrpSerIleAspProGluArgAla                              202530                                                                        ValLeuLeuValHisAspMetGlnArgTyrPheLeuArgProLeuPro                              354045                                            
                            AspAlaLeuArgAspGluValValSerAsnAlaAlaArgIleArgGln                              505560                                                                        TrpAlaAlaAspAsnGlyValProValAlaTyrThrAlaGlnProGly                              65707580                                                                      SerMetSerGluGluGlnArgGlyLeuLeuLysAspPheTrpGlyPro                              859095                                                                        GlyMetLysAlaSerProAlaAspArgGluValValGlyAlaLeuThr                              100105110                                                                     ProLysProGlyAspTrpLeuLeuThrLysTrpArgTyrSerAlaPhe                              115120125                                                                     PheAsnSerAspLeuLeuGluArgMetArgAlaAsnGlyArgAspGln                              130135140                                                                     LeuIleLeuCysGlyValTyrAlaHisValGlyValLeuIleSerThr                              145150155160                                                                  ValAspAlaTyrSerAsnAspIleGlnProPheLeuValAlaAspAla                              165170175                                                                     IleAlaAspPheSerLysGluHisHisTrpMetProSerAsnThrPro                              180185190                                                                     ProAlaValAlaProCysHisHisHisArgArgGlyGlyAlaMetSer                              195200205                                                                     GlnThrAlaAlaHisLeuMetGluArgIleLeuGlnProAlaProGlu                              210215220                                                                     ProPheAlaLeuLeuTyrArgProGluSerSerGlyProGlyLeuLeu                              225230235240                                                                  AspValLeuIleGlyGluMetSerGluProGlnValLeuAlaAspIle                              245250255             
                                                        AspLeuProAlaThrSerIleGlyAlaProArgLeuAspValLeuAla                              260265270                                                                     LeuIleProTyrArgGlnIleAlaGluArgGlyPheGluAlaValAsp                              275280285                                                                     AspGluSerProLeuLeuAlaMetAsnIleThrGluGlnGlnSerIle                              290295300                                                                     SerIleGluArgLeuLeuGlyMetLeuProAsnValProIleGlnLeu                              305310315320                                                                  AsnSerGluArgPheAspLeuSerAspAlaSerTyrAlaGluIleVal                              325330335                                                                     SerGlnValIleAlaAsnGluIleGlySerGlyGluGlyAlaAsnPhe                              340345350                                                                     ValIleLysArgThrPheLeuAlaGluIleSerGluTyrGlyProAla                              355360365                                                                     SerAlaLeuSerPhePheArgHisLeuLeuGluArgGluLysGlyAla                              370375380                                                                     TyrTrpThrPheIleIleHisThrGlySerArgThrPheValGlyAla                              385390395400                                                                  SerProGluArgHisIleSerIleLysAspGlyLeuSerValMetAsn                              405410415                                                                     ProIleSerGlyThrTyrArgTyrProProAlaGlyProAsnLeuSer                              420425430                                                                     GluValMetAspPheLeuAlaAspArgLysGluAlaAspGluLeuTyr                              435440445                                                                     MetValValAspGluGluLeu                                                   
      450455                                                                        (2) INFORMATION FOR SEQ ID NO:19:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 387 amino acids                                                   (B) TYPE: amino acid                                                          (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: protein                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:                                      MetMetAlaArgIleCysGluAspGlyGlyHisValLeuGlyProTyr                              151015                                                                        LeuLysGluMetAlaHisLeuAlaHisThrGluTyrPheIleGluGly                              202530                                                                        LysThrHisArgAspValArgGluIleLeuArgGluThrLeuPheAla                              354045                                                                        ProThrValThrGlySerProLeuGluSerAlaCysArgValIleGln                              505560                                                                        ArgTyrXaaProGlnGlyArgAlaTyrTyrSerGlyMetAlaAlaLeu                              65707580                                                                      IleGlySerAspGlyLysGlyGlyArgSerLeuAspSerAlaIleLeu                              859095                                                                        IleArgThrAlaAspIleAspAsnSerGlyGluValArgIleSerVal                              100105110                                                                     GlySerThrIleValArgHisSerAspProMetThrGluAlaAlaGlu                              115120125                                                                     SerArgAlaLysAlaThrGlyLeuIleSerAlaLeuLysAsnGlnAla                              130135140                                   
                                  ProSerArgPheGlyAsnHisLeuGlnValArgAlaAlaLeuAlaSer                              145150155160                                                                  ArgAsnAlaTyrValSerAspPheTrpLeuMetAspSerGlnGlnArg                              165170175                                                                     GluGlnIleGlnAlaAspPheSerGlyArgGlnValLeuIleValAsp                              180185190                                                                     AlaGluAspThrPheThrSerMetIleAlaLysGlnLeuArgAlaLeu                              195200205                                                                     GlyLeuValValThrValCysSerPheSerAspGluTyrSerPheGlu                              210215220                                                                     GlyTyrAspLeuValIleMetGlyProGlyProGlyAsnProSerGlu                              225230235240                                                                  ValGlnGlnProLysIleAsnHisLeuHisValAlaIleArgSerLeu                              245250255                                                                     LeuSerGlnGlnArgProPheLeuAlaValCysLeuSerHisGlnVal                              260265270                                                                     LeuSerLeuCysLeuGlyLeuGluLeuGlnArgLysAlaIleProAsn                              275280285                                                                     GlnGlyValGlnLysGlnIleAspLeuPheGlyAsnValGluArgVal                              290295300                                                                     GlyPheTyrAsnThrPheAlaAlaGlnSerSerSerAspArgLeuAsp                              305310315320                                                                  IleAspGlyIleGlyThrValGluIleSerArgAspSerGluThrGly                              325330335                                                                     GluValHisAlaLeuArgGlyProSerPheAlaSerMetGlnPheHis                              340345350       
                                                              AlaGluSerLeuLeuThrGlnGluGlyProArgIleIleAlaAspLeu                              355360365                                                                     LeuArgHisAlaLeuIleHisThrProValGluAsnAsnAlaSerAla                              370375380                                                                     AlaGlyArg                                                                     385                                                                           (2) INFORMATION FOR SEQ ID NO:20:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 278 amino acids                                                   (B) TYPE: amino acid                                                          (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: protein                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:                                      MetHisHisTyrValIleIleAspAlaPheAlaSerValProLeuGlu                              151015                                                                        GlyAsnProValAlaValPhePheAspAlaAspAspLeuSerAlaGlu                              202530                                                                        GlnMetGlnArgIleAlaArgGluMetAsnLeuSerGluThrThrPhe                              354045                                                                        ValLeuLysProArgAsnCysGlyAspAlaLeuIleArgIlePheThr                              505560                                                                        ProValAsnGluLeuProPheAlaGlyHisProLeuLeuGlyThrAsp                              65707580                                                                      IleAlaLeuGlyAlaArgThrAspAsnHisArgLeuPheLeuGluThr                              859095                                                            
            GlnMetGlyThrIleAlaPheGluLeuGluArgGlnAsnGlySerVal                              100105110                                                                     IleAlaAlaSerMetAspGlnProIleProThrTrpThrAlaLeuGly                              115120125                                                                     ArgAspAlaGluLeuLeuLysAlaLeuGlyIleSerAspSerThrPhe                              130135140                                                                     ProIleGluIleTyrHisAsnGlyProArgHisValPheValGlyLeu                              145150155160                                                                  ProSerIleAlaAlaLeuSerAlaLeuHisProAspHisArgAlaLeu                              165170175                                                                     TyrSerPheHisAspMetAlaIleAsnCysPheAlaGlyAlaGlyArg                              180185190                                                                     ArgTrpArgSerArgMetPheSerProAlaTyrGlyValValGluAsp                              195200205                                                                     AlaXaaThrGlySerAlaAlaGlyProLeuAlaIleHisLeuAlaArg                              210215220                                                                     HisGlyGlnIleGluPheGlyGlnGlnIleGluIleLeuGlnGlyVal                              225230235240                                                                  GluIleGlyArgProSerLeuMetPheAlaArgAlaGluGlyArgAla                              245250255                                                                     AspGlnLeuThrArgValGluValSerGlyAsnGlyIleThrPheGly                              260265270                                                                     ArgGlyThrIleValLeu                                                            275                                                                           (2) INFORMATION FOR SEQ ID NO:21:                                             (i) SEQUENCE CHARACTERISTICS:         
(A) LENGTH: 1007 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..669
(D) OTHER INFORMATION: /gene= "phz4" /label= ORF4 /note= "This DNA sequence is repeated from SEQ ID NO:17 so that the overlapping ORF4 may be separately translated"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
ATGAACAGTTCAGTACTAGGCAAGCCGCTGTTGGGTAAAGGCATGTCG 48
MetAsnSerSerValLeuGlyLysProLeuLeuGlyLysGlyMetSer
1 5 10 15
GAATCGCTGACCGGCACACTGGATGCGCCGTTCCCCGAGTACCAGAAG 96
GluSerLeuThrGlyThrLeuAspAlaProPheProGluTyrGlnLys
20 25 30
CCGCCTGCCGATCCCATGAGCGTGCTGCACAACTGGCTCGAACGCGCA 144
ProProAlaAspProMetSerValLeuHisAsnTrpLeuGluArgAla
35 40 45
CGCCGCGTGGGCATCCGCGAACCCCGTGCGCTGGCGCTGGCCACGGCT192                           ArgArgValGlyIleArgGluProArgAlaLeuAlaLeuAlaThrAla                              505560                                                                        GACAGCCAGGGCCGGCCTTCGACACGCATCGTGGTGATCAGTGAGATC240                           AspSerGlnGlyArgProSerThrArgIleValValIleSerGluIle                              65707580                                                                      AGTGACACCGGGGTGCTGTTCAGCACCCATGCCGGAAGCCAGAAAGGC288                           SerAspThrGlyValLeuPheSerThrHisAlaGlySerGlnLysGly                              859095                                                                        CGCGAACTGACAGAGAACCCCTGGGCCTCGGGGACGCTGTATTGGCGC336                           ArgGluLeuThrGluAsnProTrpAlaSerGlyThrLeuTyrTrpArg                              100105110                                                                     GAAACCAGCCAGCAGATCATCCTCAATGGCCAGGCCGTGCGCATGCCG384                           GluThrSerGlnGlnIleIleLeuAsnGlyGlnAlaValArgMetPro                              115120125                                                                     GATGCCAAGGCTGACGAGGCCTGGTTGAAGCGCCCTTATGCCACGCAT432                           AspAlaLysAlaAspGluAlaTrpLeuLysArgProTyrAlaThrHis                              130135140                                                                     CCGATGTCATCGGTGTCTCGCCAGAGTGAAGAACTCAAGGATGTTCAA480                           ProMetSerSerValSerArgGlnSerGluGluLeuLysAspValGln                              145150155160                                                                  GCCATGCGCAACGCCGCCAGGGAACTGGCCGAGGTTCAAGGTCCGCTG528                           AlaMetArgAsnAlaAlaArgGluLeuAlaGluValGlnGlyProLeu                              165170175                                                                     CCGCGTCCCGAGGGTTATTGCGTGTTTGAGTTACGGCTTGAATCGCTG576                           ProArgProGluGlyTyrCysValPheGluLeuArgLeuGluSerLeu  
                            180185190                                                                     GAGTTCTGGGGTAACGGCGAGGAGCGCCTGCATGAACGCTTGCGCTAT624                           GluPheTrpGlyAsnGlyGluGluArgLeuHisGluArgLeuArgTyr                              195200205                                                                     GACCGCAGCGCTGAAGGCTGGAAACATCGCCGGTTACAGCCATAGGGTCCCG676                       AspArgSerAlaGluGlyTrpLysHisArgArgLeuGlnPro                                    210215220                                                                     CGATAAACATGCTTTGAAGTGCCTGGCTGCTCCAGCTTCGAACTCATTGCGCAAACTTCA736               ACACTTATGACACCCGGTCAACATGAGAAAAGTCCAGATGCGAAAGAACGCGTATTCGAA796               ATACCAAACAGAGAGTCCGGATCACCAAAGTGTGTAACGACATTAACTCCTATCTGAATT856               TTATAGTTGCTCTAGAACGTTGTCCTTGACCCAGCGATAGACATCGGGCCAGAACCTACA916               TAAACAAAGTCAGACATTACTGAGGCTGCTACCATGCTAGATTTTCAAAACAAGCGTAAA976               TATCTGAAAAGTGCAGAATCCTTCAAAGCTT1007                                           (2) INFORMATION FOR SEQ ID NO:22:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 222 amino acids                                                   (B) TYPE: amino acid                                                          (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: protein                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:                                      MetAsnSerSerValLeuGlyLysProLeuLeuGlyLysGlyMetSer                              151015                                                                        GluSerLeuThrGlyThrLeuAspAlaProPheProGluTyrGlnLys                              202530                                                                        ProProAlaAspProMetSerValLeuHisAsnTrpLeuGluArgAla                              354045                
                                                        ArgArgValGlyIleArgGluProArgAlaLeuAlaLeuAlaThrAla                              505560                                                                        AspSerGlnGlyArgProSerThrArgIleValValIleSerGluIle                              65707580                                                                      SerAspThrGlyValLeuPheSerThrHisAlaGlySerGlnLysGly                              859095                                                                        ArgGluLeuThrGluAsnProTrpAlaSerGlyThrLeuTyrTrpArg                              100105110                                                                     GluThrSerGlnGlnIleIleLeuAsnGlyGlnAlaValArgMetPro                              115120125                                                                     AspAlaLysAlaAspGluAlaTrpLeuLysArgProTyrAlaThrHis                              130135140                                                                     ProMetSerSerValSerArgGlnSerGluGluLeuLysAspValGln                              145150155160                                                                  AlaMetArgAsnAlaAlaArgGluLeuAlaGluValGlnGlyProLeu                              165170175                                                                     ProArgProGluGlyTyrCysValPheGluLeuArgLeuGluSerLeu                              180185190                                                                     GluPheTrpGlyAsnGlyGluGluArgLeuHisGluArgLeuArgTyr                              195200205                                                                     AspArgSerAlaGluGlyTrpLysHisArgArgLeuGlnPro                                    210215220                                                                     __________________________________________________________________________
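The relationship between the CDS feature of SEQ ID NO:21 (LOCATION: 1..669) and the 222-residue length stated for SEQ ID NO:22 can be checked with a short script. What follows is only a minimal sketch, assuming the standard genetic code and assuming that the stated CDS range includes the stop codon. The function names `translate` and `cds_residue_count` are hypothetical helpers introduced for illustration and are not part of the original listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, as used in the listing.
# "X" (shown as Xaa) stands for a codon containing an ambiguous base such as N.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "X": "Xaa",
}


def translate(dna):
    """Translate DNA in frame 1 until the first stop codon (hypothetical helper)."""
    residues = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3].upper()
        aa = CODON_TABLE.get(codon, "X")  # unknown or ambiguous codon -> Xaa
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return "".join(residues)


def cds_residue_count(start, end):
    """Residues encoded by a CDS range, assuming the range includes the stop codon."""
    return (end - start + 1) // 3 - 1


# First 48 bases of SEQ ID NO:21, copied from the listing.
first_row = "ATGAACAGTTCAGTACTAGGCAAGCCGCTGTTGGGTAAAGGCATGTCG"
print(translate(first_row))
print(cds_residue_count(1, 669))
```

Under these assumptions, positions 1..669 span 223 codons, the last of which is the TAG stop codon, leaving 222 translated residues.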

What is claimed is:
 1. An isolated DNA molecule encoding one or more polypeptides required for the biosynthesis of pyrrolnitrin, wherein said DNA molecule has a nucleotide sequence selected from the following group: SEQ ID NO:1; ORF1 of SEQ ID NO:1; ORF2 of SEQ ID NO:1; ORF3 of SEQ ID NO:1; and ORF4 of SEQ ID NO:1.
 2. An isolated DNA molecule encoding one or more polypeptides required in the biosynthetic pathway of pyrrolnitrin in a host, wherein said one or more polypeptides have amino acid sequences selected from SEQ ID NO's: 2-5.
 3. An expression vector comprising the isolated DNA molecule of claim 2.
 4. A plant host transformed with the expression vector of claim 3.
 5. A bacterial host transformed with the expression vector of claim 3.
 6. The bacterial host of claim 5, wherein said host is a Pseudomonad.
 7. The bacterial host of claim 5, wherein said host is E. coli.